Patchwork [7/8] oe-git-proxy.sh: Add a new comprehensive git proxy script

Submitter Darren Hart
Date Feb. 5, 2013, 10:31 a.m.
Message ID <41381da878685b601c62d446795c38119f08941b.1360059615.git.dvhart@linux.intel.com>
Permalink /patch/44087/
State New

Comments

Darren Hart - Feb. 5, 2013, 10:31 a.m.
oe-git-proxy.sh is a simple tool to be used via GIT_PROXY_COMMAND. It
uses BSD netcat to make SOCKS5 or HTTPS proxy connections. It uses
ALL_PROXY to determine the proxy server, protocol, and port. It uses
NO_PROXY to skip using the proxy for a comma delimited list of hosts,
host globs (*.example.com), IPs, or CIDR masks (192.168.1.0/24). It is
known to work with both bash and dash shells.

Signed-off-by: Darren Hart <dvhart@linux.intel.com>
---
 scripts/oe-git-proxy.sh |  125 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 125 insertions(+), 0 deletions(-)
 create mode 100755 scripts/oe-git-proxy.sh
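
A minimal usage sketch based on the description above. The script path is an assumption (it depends on where the checkout lives), and the proxy and host names are placeholders in the style of the script's own header comment. git runs the command named in GIT_PROXY_COMMAND as `command host port` when connecting over the git:// protocol.

```shell
#!/bin/sh
# Assumed location of the script; adjust to the actual checkout.
export GIT_PROXY_COMMAND="$HOME/poky/scripts/oe-git-proxy.sh"

# Proxy to use, in protocol://host:port form (placeholder host).
export ALL_PROXY="socks://socks.example.com:1080"

# Hosts, host globs, IPs, or CIDR masks that should bypass the proxy.
export NO_PROXY="localhost,*.example.com,192.168.1.0/24"

# With these set, a git:// fetch would go through the script, e.g.:
#   git clone git://git.example.org/project.git
```

The same command can also be configured per repository through git's core.gitProxy setting instead of the environment variable.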
Enrico Scholz - Feb. 5, 2013, 11:16 a.m.
Darren Hart <dvhart-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> writes:

> +	$NC -X connect $*

why '$*' but not '"$@*"'?


> +$NC $METHOD $*

ditto


Enrico
Darren Hart - Feb. 5, 2013, 4:20 p.m.
On 02/05/2013 03:16 AM, Enrico Scholz wrote:
> 
> 
> Darren Hart <dvhart-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> writes:
> 
>> +	$NC -X connect $*
> 
> why '$*' but not '"$@*"'?
> 
> 
>> +$NC $METHOD $*
> 
> ditto

I'm not familiar with $@*

As for $* versus $@, the issue is how the arguments are presented. $* as
a single word, $@ each argument is quoted separately. I believe I ran
into issues with $@. I haven't had any trouble with $*. Is there a
particular use case where you can see this failing as is?
Enrico Scholz - Feb. 5, 2013, 4:36 p.m.
Darren Hart <dvhart@linux.intel.com> writes:

>>> +	$NC -X connect $*
>> 
>> why '$*' but not '"$@*"'?
>> 
> I'm not familiar with $@*

sorry... I meant "$@"


> As for $* versus $@, the issue is how the arguments are presented. $*
> as a single word, $@ each argument is quoted separately. I believe I
> ran into issues with $@. I haven't had any trouble with $*.

$* is causing trouble all the time because it does not retain whitespaces
or empty parameters.  There are only very few cases, where $* makes sense.


> Is there a particular use case where you can see this failing as is?

"$@" is just the right thing to do in this situation.  E.g. when your
script is called as

| oe-git-proxy.sh "${HOST}" "${PORT}"

and HOST is undefined due to some reason, you will try to connect to
"${PORT}" with $*.  The "$@" will cause nc to complain about the broken
HOST parameter.


Btw...

| exec $NC $METHOD "$@"

would be the school book implementation for the thing you want to do...



Enrico
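
To make the difference discussed above concrete, here is a small sketch that can be run under bash or dash. The helper names (count_args, via_star, via_at) are made up for illustration and are not part of the patch.

```shell
#!/bin/sh
# Print the number of arguments received.
count_args() { echo $#; }

# Unquoted $*: the arguments are re-split on whitespace and empty ones vanish.
via_star() { count_args $*; }

# Quoted "$@": every argument is passed through unchanged.
via_at() { count_args "$@"; }

via_star "" 9418     # prints 1: the empty host argument is lost
via_at   "" 9418     # prints 2: the empty host argument is kept
via_star "a b" c     # prints 3: "a b" was split into two words
via_at   "a b" c     # prints 2
```

The first pair mirrors the undefined-HOST case: with $* the port silently moves into the host position, while "$@" keeps the empty argument so the callee can reject it.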
Darren Hart - Feb. 5, 2013, 5:44 p.m.
On 02/05/2013 08:36 AM, Enrico Scholz wrote:
> Darren Hart <dvhart@linux.intel.com> writes:
> 
>>>> +	$NC -X connect $*
>>>
>>> why '$*' but not '"$@*"'?
>>>
>> I'm not familiar with $@*
> 
> sorry... I meant "$@"
> 
> 
>> As for $* versus $@, the issue is how the arguments are presented. $*
>> as a single word, $@ each argument is quoted separately. I believe I
>> ran into issues with $@. I haven't had any trouble with $*.
> 
> $* is causing trouble all the time because it does not retain whitespaces
> or empty parameters.  There are only very few cases, where $* makes sense.
> 
> 
>> Is there a particular use case where you can see this failing as is?
> 
> "$@" is just the right thing to do in this situation.  E.g. when your
> script is called as
> 
> | oe-git-proxy.sh "${HOST}" "${PORT}"
> 
> and HOST is undefined due to some reason, you will try to connect to
> "${PORT}" with $*.  The "$@" will cause nc to complain about the broken
> HOST parameter.
> 
> 
> Btw...
> 
> | exec $NC $METHOD "$@"
> 
> would be the school book implementation for the thing you want to do...
> 
> 
> 
> Enrico

That all makes sense. When I read up the difference again in the bash
documentation I was surprised I had used $*, but thought I had done that
dance already. I'll update with "$@" and do some tests.

Thank you for the review and catching that.
Otavio Salvador - Feb. 5, 2013, 6:40 p.m.
On Tue, Feb 5, 2013 at 3:44 PM, Darren Hart <dvhart@linux.intel.com> wrote:
>
>
> On 02/05/2013 08:36 AM, Enrico Scholz wrote:
>> Darren Hart <dvhart@linux.intel.com> writes:
>>
>>>>> +  $NC -X connect $*
>>>>
>>>> why '$*' but not '"$@*"'?
>>>>
>>> I'm not familiar with $@*
>>
>> sorry... I meant "$@"
>>
>>
>>> As for $* versus $@, the issue is how the arguments are presented. $*
>>> as a single word, $@ each argument is quoted separately. I believe I
>>> ran into issues with $@. I haven't had any trouble with $*.
>>
>> $* is causing trouble all the time because it does not retain whitespaces
>> or empty parameters.  There are only very few cases, where $* makes sense.
>>
>>
>>> Is there a particular use case where you can see this failing as is?
>>
>> "$@" is just the right thing to do in this situation.  E.g. when your
>> script is called as
>>
>> | oe-git-proxy.sh "${HOST}" "${PORT}"
>>
>> and HOST is undefined due to some reason, you will try to connect to
>> "${PORT}" with $*.  The "$@" will cause nc to complain about the broken
>> HOST parameter.
>>
>>
>> Btw...
>>
>> | exec $NC $METHOD "$@"
>>
>> would be the school book implementation for the thing you want to do...
>>
>>
>>
>> Enrico
>
> That all makes sense. When I read up the difference again in the bash
> documentation I was surprised I had used $*, but thought I had done that
> dance already. I'll update with "$@" and do some tests.
>
> Thank you for the review and catching that.

Please give it a try in dash as well.

--
Otavio Salvador                             O.S. Systems
E-mail: otavio@ossystems.com.br  http://www.ossystems.com.br
Mobile: +55 53 9981-7854              http://projetos.ossystems.com.br
Darren Hart - Feb. 5, 2013, 6:50 p.m.
On 02/05/2013 10:40 AM, Otavio Salvador wrote:
> On Tue, Feb 5, 2013 at 3:44 PM, Darren Hart <dvhart@linux.intel.com> wrote:
>>
>>
>> On 02/05/2013 08:36 AM, Enrico Scholz wrote:
>>> Darren Hart <dvhart@linux.intel.com> writes:
>>>
>>>>>> +  $NC -X connect $*
>>>>>
>>>>> why '$*' but not '"$@*"'?
>>>>>
>>>> I'm not familiar with $@*
>>>
>>> sorry... I meant "$@"
>>>
>>>
>>>> As for $* versus $@, the issue is how the arguments are presented. $*
>>>> as a single word, $@ each argument is quoted separately. I believe I
>>>> ran into issues with $@. I haven't had any trouble with $*.
>>>
>>> $* is causing trouble all the time because it does not retain whitespaces
>>> or empty parameters.  There are only very few cases, where $* makes sense.
>>>
>>>
>>>> Is there a particular use case where you can see this failing as is?
>>>
>>> "$@" is just the right thing to do in this situation.  E.g. when your
>>> script is called as
>>>
>>> | oe-git-proxy.sh "${HOST}" "${PORT}"
>>>
>>> and HOST is undefined due to some reason, you will try to connect to
>>> "${PORT}" with $*.  The "$@" will cause nc to complain about the broken
>>> HOST parameter.
>>>
>>>
>>> Btw...
>>>
>>> | exec $NC $METHOD "$@"
>>>
>>> would be the school book implementation for the thing you want to do...
>>>
>>>
>>>
>>> Enrico
>>
>> That all makes sense. When I read up the difference again in the bash
>> documentation I was surprised I had used $*, but thought I had done that
>> dance already. I'll update with "$@" and do some tests.
>>
>> Thank you for the review and catching that.
> 
> Please give it a try in dash as well.

$@ also works with dash. I have been testing in bash and dash throughout
development as well.
Enrico Scholz - Feb. 5, 2013, 7:08 p.m.
Otavio Salvador <otavio@ossystems.com.br> writes:

> Please give it a try in dash as well.

fwiw, what's the point in writing such scripts in plain sh?  I guess every
machine where this script is running has /bin/bash and performance is
not critical for it. Using a '#! /bin/bash' shebang fixes the problem
where sh is not bash.

Some constructs in the scripts can be expressed more efficiently in bash
(e.g. the '... | sed' statements, or using arrays for the arguments).


Enrico
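
For reference, a sketch of how the two kinds of '... | sed' pipelines in the patch could be written with shell-internal expansions. The forms used here are POSIX parameter expansions and IFS splitting, so they should work in dash as well; bash-only forms such as ${VAR//,/ } are deliberately avoided. The sample values and the function name split_no_proxy are assumptions for illustration.

```shell
#!/bin/sh
ALL_PROXY="socks://socks.example.com:1080/"          # sample value (assumption)
NO_PROXY="localhost,*.example.com,192.168.1.0/24"    # sample value (assumption)

# Protocol: everything before the first "://".
PROTO=${ALL_PROXY%%://*}

# host:port: drop the "proto://" prefix, then everything from the first "/".
PROXY=${ALL_PROXY#*://}
PROXY=${PROXY%%/*}
echo "$PROTO $PROXY"

# Split a comma-separated list without sed. Globbing is switched off while
# splitting so an entry like *.example.com is not expanded against the
# files in the current directory.
split_no_proxy() {
	old_ifs=$IFS
	IFS=,
	set -f
	for H in $1; do
		echo "$H"
	done
	set +f
	IFS=$old_ifs
}
split_no_proxy "$NO_PROXY"
```

Bash arrays would still be the more natural way to hold the nc arguments; this only covers the string handling.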
Darren Hart - Feb. 5, 2013, 7:18 p.m.
On 02/05/2013 11:08 AM, Enrico Scholz wrote:
> Otavio Salvador <otavio@ossystems.com.br> writes:
> 
>> Please give it a try in dash as well.
> 
> fwiw, what's the point in writing such scripts in plain sh?  I guess every
> machine where this script is running has /bin/bash and performance is
> not critical for it. Using a '#! /bin/bash' shebang fixes the problem
> where sh is not bash.
> 
> Some constructs in the scripts can be expressed more efficiently in bash
> (e.g. the '... | sed' statements, or using arrays for the arguments).

Indeed, in fact I had to remove the bash parameter expansions
(substrings, etc) I had in my original script in order to make it
dash-able. I hate dash and personally find it to be completely
pointless, but for better or worse, people like it and there is an
expectation that scripts run in both dash and bash in OE-core.

If RP tells me a #!/bin/bash is acceptable, I'll restore the bashisms in
a heartbeat.
Otavio Salvador - Feb. 5, 2013, 7:29 p.m.
On Tue, Feb 5, 2013 at 5:18 PM, Darren Hart <dvhart@linux.intel.com> wrote:
>
>
> On 02/05/2013 11:08 AM, Enrico Scholz wrote:
>> Otavio Salvador <otavio@ossystems.com.br> writes:
>>
>>> Please give it a try in dash as well.
>>
>> fwiw, what's the point in writing such scripts in plain sh?  I guess every
>> machine where this script is running has /bin/bash and performance is
>> not critical for it. Using a '#! /bin/bash' shebang fixes the problem
>> where sh is not bash.
>>
>> Some constructs in the scripts can be expressed more efficiently in bash
>> (e.g. the '... | sed' statements, or using arrays for the arguments).
>
> Indeed, in fact I had to remove the bash parameter expansions
> (substrings, etc) I had in my original script in order to make it
> dash-able. I hate dash and personally find it to be completely
> pointless, but for better or worse, people like it and there is an
> expectation that scripts run in both dash and bash in OE-core.
>
> If RP tells me a #!/bin/bash is acceptable, I'll restore the bashisms in
> a heartbeat.

Please drop the .sh then; I'd promptly call it as:

sh oe-git-proxy.sh

And in my system /bin/sh is dash.

--
Otavio Salvador                             O.S. Systems
E-mail: otavio@ossystems.com.br  http://www.ossystems.com.br
Mobile: +55 53 9981-7854              http://projetos.ossystems.com.br
Richard Purdie - Feb. 5, 2013, 10:10 p.m.
On Tue, 2013-02-05 at 11:18 -0800, Darren Hart wrote:
> 
> On 02/05/2013 11:08 AM, Enrico Scholz wrote:
> > Otavio Salvador <otavio@ossystems.com.br> writes:
> > 
> >> Please give it a try in dash as well.
> > 
> > fwiw, what's the point in writing such scripts in plain sh?  I guess every
> > machine where this script is running has /bin/bash and performance is
> > not critical for it. Using a '#! /bin/bash' shebang fixes the problem
> > where sh is not bash.
> > 
> > Some constructs in the scripts can be expressed more efficiently in bash
> > (e.g. the '... | sed' statements, or using arrays for the arguments).
> 
> Indeed, in fact I had to remove the bash parameter expansions
> (substrings, etc) I had in my original script in order to make it
> dash-able. I hate dash and personally find it to be completely
> pointless, but for better or worse, people like it and there is an
> expectation that scripts run in both dash and bash in OE-core.
> 
> If RP tells me a #!/bin/bash is acceptable, I'll restore the bashisms in
> a heartbeat.

General shell code in metadata should run under /bin/sh. It's fine for
external scripts to use bashisms as long as they say so at the top of the
file :)

Cheers,

Richard

Patch

diff --git a/scripts/oe-git-proxy.sh b/scripts/oe-git-proxy.sh
new file mode 100755
index 0000000..e12289d
--- /dev/null
+++ b/scripts/oe-git-proxy.sh
@@ -0,0 +1,125 @@ 
+#!/bin/sh
+
+# oe-git-proxy.sh is a simple tool to be used via GIT_PROXY_COMMAND. It uses BSD netcat
+# to make SOCKS5 or HTTPS proxy connections. It uses ALL_PROXY to determine the
+# proxy server, protocol, and port. It uses NO_PROXY to skip using the proxy for
+# a comma delimited list of hosts, host globs (*.example.com), IPs, or CIDR masks
+# (192.168.1.0/24). It is known to work with both bash and dash shells.
+#
+# BSD netcat is provided by netcat-openbsd on Ubuntu and nc on Fedora.
+#
+# Example ALL_PROXY values:
+# ALL_PROXY=socks://socks.example.com:1080
+# ALL_PROXY=https://proxy.example.com:8080
+#
+# Copyright (c) 2013, Intel Corporation.
+# All rights reserved.
+#
+# AUTHORS
+# Darren Hart <dvhart@linux.intel.com>
+
+# Locate the netcat binary
+NC=$(which nc 2>/dev/null)
+if [ $? -ne 0 ]; then
+	echo "ERROR: nc binary not in PATH" >&2
+	exit 1
+fi
+METHOD=""
+
+# Test for a valid IPV4 quad with optional bitmask
+valid_ipv4() {
+	echo $1 | egrep -q "^([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}(/(3[0-2]|[1-2]?[0-9]))?$"
+	return $?
+}
+
+# Convert an IPV4 address into a 32bit integer
+ipv4_val() {
+	SHIFT=24
+	VAL=0
+	for B in $(echo "$1" | sed -e 's/\./ /g'); do
+		VAL=$(($VAL+$(($B<<$SHIFT))))
+		SHIFT=$(($SHIFT-8))
+	done
+	echo "$VAL"
+}
+
+# Determine if two IPs are equivalent, or if the CIDR contains the IP
+match_ipv4() {
+	CIDR=$1
+	IP=$2
+
+	if [ -z "${IP%%$CIDR}" ]; then
+		return 0
+	fi
+
+	# Determine the mask bitlength; a bare IP (no "/") cannot match as a CIDR
+	BITS=${CIDR##*/}
+	if [ -z "$BITS" ] || [ "$BITS" = "$CIDR" ]; then
+		return 1
+	fi
+
+	IPVAL=$(ipv4_val $IP)
+	IP2VAL=$(ipv4_val ${CIDR%%/*})
+
+	# OR in the unmasked (host) bits: bit positions 0 .. 31-BITS
+	for i in $(seq 0 $((31-$BITS))); do
+		IP2VAL=$(($IP2VAL|$((1<<$i))))
+		IPVAL=$(($IPVAL|$((1<<$i))))
+	done
+
+	echo "$IPVAL vs. $IP2VAL" >&2
+	if [ $IPVAL -eq $IP2VAL ]; then
+		return 0
+	fi
+	return 1
+}
+
+# Test to see if GLOB matches HOST
+match_host() {
+	HOST=$1
+	GLOB=$2
+
+	if [ -z "${HOST%%$GLOB}" ]; then
+		return 0
+	fi
+
+	# Match by netmask
+	if valid_ipv4 $GLOB; then
+		HOST_IP=$(gethostip -d $HOST)
+		if valid_ipv4 $HOST_IP; then
+			match_ipv4 $GLOB $HOST_IP
+			if [ $? -eq 0 ]; then
+				return 0
+			fi
+		fi
+	fi
+
+	return 1
+}
+
+# If no proxy is set, just connect directly
+if [ -z "$ALL_PROXY" ]; then
+	$NC -X connect $*
+	exit
+fi
+
+# Connect directly to hosts in NO_PROXY
+for H in $(echo "$NO_PROXY" | sed -e 's/,/ /g'); do
+	if match_host $1 $H; then
+		METHOD="-X connect"
+		break
+	fi
+done
+
+if [ -z "$METHOD" ]; then
+	# strip the protocol and the trailing slash
+	PROTO=$(echo $ALL_PROXY | sed -e 's/\([^:]*\):\/\/.*/\1/')
+	PROXY=$(echo $ALL_PROXY | sed -e 's/.*:\/\/\([^:]*:[0-9]*\).*/\1/')
+	if [ "$PROTO" = "socks" ]; then
+		METHOD="-X 5 -x $PROXY"
+	elif [ "$PROTO" = "https" ]; then
+		METHOD="-X connect -x $PROXY"
+	fi
+fi
+
+$NC $METHOD $*
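
The CIDR comparison in match_ipv4 can be checked in isolation. The following is a standalone sketch of the same idea, not the code from the patch: it builds the host mask arithmetically instead of looping over seq, and the function names ipv4_to_int and cidr_contains are hypothetical. It assumes well-formed dotted-quad input and a shell with at least 64-bit arithmetic (true for current bash and dash builds, as far as I know).

```shell
#!/bin/sh
# Convert a dotted quad such as 192.168.1.0 into a 32-bit integer.
ipv4_to_int() {
	old_ifs=$IFS
	IFS=.
	set -- $1          # split the address into its four octets
	IFS=$old_ifs
	echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if the address in $2 lies inside the network/bits given in $1.
cidr_contains() {
	net_val=$(ipv4_to_int "${1%%/*}")
	ip_val=$(ipv4_to_int "$2")
	bits=${1##*/}
	# Host mask: the low (32 - bits) bits set, e.g. 255 for a /24.
	hostmask=$(( (1 << (32 - $bits)) - 1 ))
	# OR-ing the host bits into both values leaves only the network
	# part to distinguish them.
	[ $(( $net_val | $hostmask )) -eq $(( $ip_val | $hostmask )) ]
}

ipv4_to_int 192.168.1.0                                        # prints 3232235776
cidr_contains 192.168.1.0/24 192.168.1.77 && echo "match"      # prints match
cidr_contains 192.168.1.0/24 192.168.0.5  || echo "no match"   # prints no match
```

The last call is the interesting one: 192.168.0.5 differs from the network only in bit 8, so it is a quick check that exactly 32 - bits host bits are being masked.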