Fixing paramiko installation errors | 头脑的思考

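I recently installed ansible on a machine via pip, and it took a lot of trouble, so here is a summary. During the installation, pip reported the error

'ascii' codec can't decode byte 0xe2 in position 75: ordinal not in range(128)

I did not pay much attention to it. The installation finished, but ansible did not run properly, and importing the paramiko module at the Python prompt raised this error: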

ImportError: No module named cryptography.hazmat.backends

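This clearly means the cryptography module was not installed successfully. I then tried to install it with pip install, and near the end it failed again with

'ascii' codec can't decode byte 0xe2 in position 75: ordinal not in range(128)

The fix is as follows: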

export LC_ALL=C

pip install --upgrade setuptools

This resolves the previous error, but running

pip install cryptography

produces this error: distutils.errors.DistutilsError: Setup script exited with error: command 'gcc' failed with exit status 1

Solution:

yum install gcc libffi-devel python-devel openssl-devel

After that, the installation completes without problems:

pip install cryptography
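As a side note on the first error: the ascii codec only accepts byte values below 128, and 0xe2 is commonly the first byte of a three-byte UTF-8 sequence. The sketch below assumes, purely for illustration, that the offending character was U+2019 (a typographic apostrophe); the error message does not say which character or file was actually involved.

```shell
# Assumption: U+2019 stands in for the unknown character behind the error.
# Octal escapes are used so the snippet does not depend on the file's encoding.
quote_bytes=$(printf '\342\200\231' | od -An -tx1 | tr -d ' \n')

# The first byte is 0xe2, which lies outside the ASCII range 0x00-0x7f.
echo "$quote_bytes"   # e28099
```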

What export LC_ALL=C actually means is explained below:

LC_ALL is the environment variable that overrides all the other localisation settings (except $LANGUAGE under some circumstances).

Different aspects of localisations (like the thousand separator or decimal point character, character set, sorting order, month, day names, language or application messages like error messages, currency symbol) can be set using a few environment variables.

You’ll typically set $LANG to your preference with a value that identifies your region (like fr_CH.UTF-8 if you’re in French speaking Switzerland, using UTF-8). The individual LC_xxx variables override a certain aspect. LC_ALL overrides them all. The locale command, when called without argument gives a summary of the current settings.

For instance, on a GNU system, I get:

$ locale
LANG=en_GB.UTF-8
LANGUAGE=
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=

I can override an individual setting with for instance:

$ LC_TIME=fr_FR.UTF-8 date
jeudi 22 août 2013, 10:41:30 (UTC+0100)

Or:

$ LC_MONETARY=fr_FR.UTF-8 locale currency_symbol
€

Or override everything with LC_ALL.

$ LC_ALL=C LANG=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8 cat /
cat: /: Is a directory

In a script, if you want to force a specific setting, as you don’t know what settings the user has forced (possibly LC_ALL as well), your best, safest and generally only option is to force LC_ALL.
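A minimal sketch of forcing the setting at the top of a script. The `date -u -d @0` call assumes GNU date syntax, and the `epoch_day` variable name is only illustrative; neither comes from the text above.

```shell
#!/bin/sh
# Force the C locale for this script and every command it starts,
# regardless of what LANG or LC_* the caller has exported.
LC_ALL=C
export LC_ALL

# Even with a French LC_TIME in the environment, LC_ALL takes precedence.
# "date -u -d @0" (GNU date syntax, an assumption about the platform)
# formats the Unix epoch, which fell on a Thursday.
epoch_day=$(LC_TIME=fr_FR.UTF-8 date -u -d @0 +%A)
echo "$epoch_day"   # Thursday
```

Exporting LC_ALL once at the top keeps individual commands free of per-call prefixes, at the cost of also switching any user-facing messages the script prints to the C locale.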

The C locale is a special locale that is meant to be the simplest locale. You could also say that while the other locales are for humans, the C locale is for computers. In the C locale, characters are single bytes, the charset is ASCII (it is not required to be, but in practice it will be on the systems most of us will ever get to use), the sorting order is based on the byte values, the language is usually US English (though for application messages, as opposed to things like month or day names or messages by system libraries, it’s at the discretion of the application author) and things like currency symbols are not defined.

On some systems, there’s a difference with the POSIX locale where for instance the sort order for non-ASCII characters is not defined.

You generally run a command with LC_ALL=C to keep the user’s settings from interfering with your script. For instance, if you want [a-z] to match the 26 ASCII characters from a to z, you have to set LC_ALL=C.

On GNU systems, LC_ALL=C and LC_ALL=POSIX (or LC_MESSAGES=C|POSIX) override $LANGUAGE, while LC_ALL=anything-else wouldn’t.

A few cases where you typically need to set LC_ALL=C:

sort -u or sort ... | uniq.... In many locales other than C, on some systems (notably GNU ones), some characters have the same sorting order. sort -u doesn’t report unique lines, but one of each group of lines that have equal sorting order. So if you do want unique lines, you need a locale where characters are bytes and all characters have different sorting order (which the C locale guarantees).

the same applies to the = operator of POSIX compliant expr or == operator of POSIX compliant awks (mawk and gawk are not POSIX in that regard), that don’t check whether two strings are identical but whether they sort the same.
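A small sketch of the sort -u case under LC_ALL=C. The sample input is made up and deliberately ASCII-only, so the result does not depend on which other locales are installed.

```shell
# In the C locale, sort compares raw byte values, so upper-case ASCII
# letters (0x41-0x5a) come before lower-case ones (0x61-0x7a), and
# "sort -u" only collapses lines that are byte-for-byte identical.
unique_lines=$(printf 'b\nB\na\nb\n' | LC_ALL=C sort -u | paste -s -d ' ' -)
echo "$unique_lines"   # B a b
```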

Character ranges like in grep. If you mean to match a letter in the user’s language, use grep '[[:alpha:]]' and don’t modify LC_ALL. But if you want to match the a-zA-Z ASCII characters, you need either LC_ALL=C grep '[[:alpha:]]' or LC_ALL=C grep '[a-zA-Z]'. [a-z] matches the characters that sort after a and before z (though with many APIs it’s more complicated than that). In other locales, you generally don’t know what those are. For instance some locales ignore case for sorting so [a-z] in some APIs like bash patterns, could include [B-Z] or [A-Y]. In many UTF-8 locales (including en_US.UTF-8 on most systems), [a-z] will include the latin letters from a to y with diacritics but not those of z (since z sorts before them) which I can’t imagine would be what you want (why would you want to include é and not ź?).
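The two LC_ALL=C variants for character ranges can be compared with a short sketch. The input lines are invented for illustration and kept to ASCII so the output is the same on any system.

```shell
# Under LC_ALL=C, [a-z] is exactly the 26 lower-case ASCII letters and
# [[:alpha:]] is exactly the 52 ASCII letters.
ascii_lower=$(printf 'a\nB\nz\n7\n' | LC_ALL=C grep '[a-z]' | paste -s -d ' ' -)
ascii_alpha=$(printf 'a\nB\nz\n7\n' | LC_ALL=C grep '[[:alpha:]]' | paste -s -d ' ' -)

echo "$ascii_lower"   # a z
echo "$ascii_alpha"   # a B z
```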

floating point arithmetic in ksh93. ksh93 honours the decimal_point setting in LC_NUMERIC. If you write a script that contains a=$((1.2/7)), it will stop working when run by a user whose locale has comma as the decimal separator:

$ ksh93 -c 'echo $((1.1/2))'
0.55
$ LANG=fr_FR.UTF-8 ksh93 -c 'echo $((1.1/2))'
ksh93: 1.1/2: arithmetic syntax error

Then you need things like:

#! /bin/ksh93 -
float input="$1" # get it as input from the user in his locale
float output
arith() { typeset LC_ALL=C; (($@)); }
arith output=input/1.2 # use the dot here as it will be interpreted
# under LC_ALL=C
echo "$output" # output in the user's locale

As a side note: the , decimal separator conflicts with the , arithmetic operator which can cause even more confusion.

When you need characters to be bytes. Nowadays, most locales are UTF-8 based, which means characters can take up from 1 to 4 bytes (up to 6 in the original UTF-8 definition). When dealing with data that is meant to be bytes, with text utilities, you’ll want to set LC_ALL=C. It will also improve performance significantly because parsing UTF-8 data has a cost.

a corollary of the previous point: when processing text where you don’t know what character set the input is written in, but can assume it’s compatible with ASCII (as virtually all charsets are). For instance grep '<.*>' to look for lines containing a <, > pair will not work if you’re in a UTF-8 locale and the input is encoded in a single-byte 8-bit character set like iso8859-15. That’s because . only matches characters and non-ASCII characters in iso8859-15 are likely not to form a valid character in UTF-8. On the other hand, LC_ALL=C grep '<.*>' will work because any byte value forms a valid character in the C locale.
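The LC_ALL=C half of that grep example can be sketched as follows. The sample line is an assumption made for illustration: it uses the octal escape \351 to emit the single byte 0xe9, which is "é" in iso8859-15 but not a valid UTF-8 sequence on its own.

```shell
# One line containing "<", the byte 0xe9 and ">".
# With LC_ALL=C every byte counts as a character, so "." matches 0xe9.
# -c prints the number of matching lines instead of the lines themselves.
matches=$(printf '<\351>\n' | LC_ALL=C grep -c '<.*>')
echo "$matches"   # 1
```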

Any time where you process input data or output data that is not intended from/for a human. If you’re talking to a user, you may want to use their convention and language, but for instance, if you generate some numbers to feed some other application that expects English style decimal points, or English month names, you’ll want to set LC_ALL=C:

$ printf '%g\n' 1e-2
0,01
$ LC_ALL=C printf '%g\n' 1e-2
0.01
$ date +%b
août
$ LC_ALL=C date +%b
Aug

That also applies to things like case insensitive comparison (like in grep -i) and case conversion (awk's toupper(), dd conv=ucase…). For instance:

grep -i i

is not guaranteed to match on I in the user’s locale. In some Turkish locales for instance, it doesn’t as upper-case i is İ (note the dot) there and lower-case I is ı (note the missing dot).
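When the data is known to be ASCII, forcing the C locale makes the case folding predictable. A minimal sketch with made-up input lines:

```shell
# Under LC_ALL=C, "i" and "I" are each other's case counterparts,
# so both of the first two lines match and "x" does not.
ascii_case_hits=$(printf 'I\ni\nx\n' | LC_ALL=C grep -ci 'i')
echo "$ascii_case_hits"   # 2
```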
