python
Auto-completion
(Linux)
In Vim (this approach comes from a web page):
wget https://github.com/rkulla/pydiction/archive/master.zip
unzip -q master
mv pydiction-master pydiction
mkdir -p ~/.vim/tools/pydiction
cp -r pydiction/after ~/.vim
cp pydiction/complete-dict ~/.vim/tools/pydiction
vim ~/.vimrc
filetype plugin on
let g:pydiction_location = '~/.vim/tools/pydiction/complete-dict'
- Terminal (interactive mode)
Any file name will do.
#!/usr/bin/python
# python startup file
import sys
import readline
import rlcompleter
import atexit
import os
readline.parse_and_bind('tab: complete')
histfile = os.path.join(os.environ['HOME'], '.pythonhistory')
try:
    readline.read_history_file(histfile)
except IOError:
    pass
atexit.register(readline.write_history_file, histfile)
del os, histfile, readline, rlcompleter
Find the local Python library paths (import sys; sys.path)
cp xx.py lib_path
Add to bashrc: `export PYTHONSTARTUP=lib_path/xx.py`
DONE
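To preview what the Tab key will offer once the startup file is active, the completer can also be queried directly. This is a minimal sketch that assumes only the standard-library `rlcompleter` module; the helper name `completions` is hypothetical.

```python
import os
import rlcompleter


def completions(prefix, namespace):
    """Return every candidate rlcompleter would offer for `prefix`."""
    completer = rlcompleter.Completer(namespace)
    found = []
    state = 0
    while True:
        # complete() returns the state-th candidate, or None when exhausted.
        candidate = completer.complete(prefix, state)
        if candidate is None:
            break
        found.append(candidate)
        state += 1
    return found


candidates = completions("os.pa", {"os": os})
print(candidates)
```

The printed list should include `os.path` among the attributes of `os` that start with `pa`.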
python version error
To begin with, Python currently has two major versions, 2 and 3.
Python 3 is the officially recommended version, but porting Python 2 code to Python 3
can be costly, because Python 3 is not backward compatible with Python 2.
yum's python version error
As of 17.04.29, yum on my Linux system still uses Python 2.
If you have installed Python 3 and errors started to appear,
the cause may be a mix-up between the two versions.
Open the script of the affected application, such as yum, and check its shebang line:
vim /usr/bin/yum
#!/usr/bin/python2.7
vim /usr/libexec/urlgrabber-ext-down
#!/usr/bin/python2.7
If your application needs Python 3, first download the source, then configure and build it.
(By the way, pyenv appears to be able to switch between Python versions automatically.)
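When two interpreters are installed side by side, it helps to confirm which one actually runs a script. A minimal sketch using only `sys`; the function names `describe_interpreter` and `require_major` are hypothetical.

```python
import sys


def describe_interpreter():
    """Return the path and major version of the interpreter running this code."""
    return sys.executable, sys.version_info.major


def require_major(expected_major):
    """Raise a clear error if the script runs under the wrong major version."""
    if sys.version_info.major != expected_major:
        raise RuntimeError(
            "expected Python %d but running %d.%d (%s)"
            % (expected_major, sys.version_info.major,
               sys.version_info.minor, sys.executable)
        )


path, major = describe_interpreter()
print(path, major)
```

Calling `require_major(3)` at the top of a Python 3 script turns a confusing syntax error under Python 2 into an explicit message.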
python request
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
def saveFile(text):
    save_path = 'grab1.out'
    f_obj = open(save_path, 'wb')
    f_obj.write(text)
    f_obj.close()
saveFile(data)
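The User-Agent header and the save step can be combined roughly as follows. This sketch swaps in the standard-library `urllib.request` so that it has no third-party dependency; `build_request`, `save_file` and `grab` are hypothetical names, and `grab` is the only function that touches the network.

```python
import urllib.request

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"
)


def build_request(url):
    """Create a request that carries the browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def save_file(data, save_path="grab1.out"):
    """Write raw bytes to disk; the with-block closes the file even on error."""
    with open(save_path, "wb") as f_obj:
        f_obj.write(data)


def grab(url, save_path="grab1.out"):
    """Fetch `url` and store the response body (performs network access)."""
    with urllib.request.urlopen(build_request(url)) as response:
        save_file(response.read(), save_path)
```

Using `with open(...)` instead of a manual `close()` keeps the file from staying open if the write fails.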
- requests
import requests
r = requests.get('https://stackoverflow.com/users')
# print(r.text)
def saveText(text):
    save_path = 'stackoverflowUsers.out'
    f_obj = open(save_path, 'wb')
    f_obj.write(text)
    f_obj.close()
rData = r.text.encode('UTF-8')
saveText(rData)
- worm
``` python
import re
from urllib.request import urlretrieve
from worm.fileio import save_text
# Read the previously saved users page into one string.
with open('.\\graber\\users.out', 'r', encoding='UTF-8') as users_obj:
    # print(users_obj)
    html_list = users_obj.readlines()
# print(type(html_str))
html_str = "".join(html_list)
# print(type(html_str))
# print(html_str)
# [^"]* stops at the closing quote, so each src URL is captured on its own.
image_matches = re.findall(r'<img src="(https://www.gravatar[^"]*)" alt', html_str, re.M | re.I)
index = 0
for image_match in image_matches:
    index += 1
    print(image_match)
    # Download each avatar as user1.jpg, user2.jpg, ...
    urlretrieve(image_match, './graber/image/user' + str(index) + '.jpg')
# image_urls_text = '''stackoverflow's users' image url:\n '''
# for image_match in image_matches:
# image_urls_text = image_urls_text + image_match +'\n'
# print(image_urls_text)
# save_text(image_urls_text, './graber/imageURL.txt')
# savefile(image_text.encode('UTF-8'),'image_url')
```
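When several `<img>` tags sit on one line, a greedy `.*` inside the capture group swallows everything up to the last `" alt`. A minimal sketch comparing it with a narrower character class; the sample HTML is made up for illustration.

```python
import re

# Two avatar tags on a single line (hypothetical markup).
sample_html = (
    '<div><img src="https://www.gravatar.com/avatar/aaa?s=48" alt="" width="48">'
    '<img src="https://www.gravatar.com/avatar/bbb?s=48" alt="" width="48"></div>'
)

# Greedy: .* runs to the end of the line and backtracks to the LAST '" alt'.
greedy = re.findall(r'<img src="(https://www\.gravatar.*)" alt', sample_html, re.I)

# Narrow: [^"]* cannot cross the closing quote of the src attribute.
narrow = re.findall(r'<img src="(https://www\.gravatar[^"]*)" alt', sample_html, re.I)

print(len(greedy), len(narrow))
```

The greedy pattern yields one oversized match that spans both tags, while the narrow one yields one URL per tag.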
df:
pd.read_csv('xxx')
fill na_values:
sentinels={'supple':['a','b']}
pd.read_csv('xx',na_values=sentinels)
chunker (read the file in chunks)
chunker=pd.read_table('xxx',chunksize=xx)
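The two reading options above can be tried without a file on disk. A minimal sketch, assuming pandas is installed; the CSV text is made up, and `read_csv` is used for both calls instead of `read_table`.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a file on disk.
csv_text = "name,supple\nx,a\ny,b\nz,c\nw,d\n"

# Treat 'a' and 'b' as missing, but only in the 'supple' column.
sentinels = {"supple": ["a", "b"]}
df = pd.read_csv(io.StringIO(csv_text), na_values=sentinels)
missing_flags = df["supple"].isna().tolist()
print(missing_flags)

# chunksize (all lowercase) returns an iterator of DataFrames.
chunker = pd.read_csv(io.StringIO(csv_text), chunksize=2)
chunk_lengths = [len(chunk) for chunk in chunker]
print(chunk_lengths)
```

Iterating over the chunker keeps only one chunk in memory at a time, which is the point of reading large files this way.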
soup:
soup=BeautifulSoup(string,"html5lib")
merge:
pd.merge(df1,df2,on="xx")
- stack(): columns -> rows (unstack() does the reverse)
- str.upper()
- cat
bins=[0,10,20]
cats=pd.cut(data,bins)
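A small run of the binning call shows how values fall into the intervals. A minimal sketch, assuming pandas is installed; the `data` values are made up.

```python
import pandas as pd

data = [1, 5, 10, 12, 20]   # made-up values
bins = [0, 10, 20]

# By default each bin is open on the left and closed on the right: (0, 10], (10, 20].
cats = pd.cut(data, bins)
codes = cats.codes.tolist()  # index of the bin each value landed in

# Count how many values fall into each bin, in bin order.
counts = pd.Series(cats).value_counts().sort_index()
print(codes)
print(counts)
```

Note that 10 lands in the first bin because the right edge is inclusive; passing `right=False` flips that.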
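- Images
img=cv2.imread('xsx',0) # 0 = grayscale
f=np.fft.fft2(img) # Fourier transform
fshift=np.fft.fftshift(f) # move the zero frequency to the centre
s=np.log(np.abs(fshift)) # real-valued magnitude, log scale
plt.subplot(211),plt.imshow(s,'gray') # 211: 2 rows x 1 column, first panel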
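The Fourier steps above can be checked with NumPy alone. This sketch replaces the image file with a made-up constant array and leaves out cv2 and matplotlib; it also uses `np.log1p` rather than `np.log` so that zero magnitudes do not produce `-inf`.

```python
import numpy as np

# Made-up 8x8 "image": constant brightness, so all energy sits at frequency zero.
img = np.ones((8, 8))

f = np.fft.fft2(img)            # 2-D Fourier transform (complex values)
fshift = np.fft.fftshift(f)     # move the zero-frequency term to the centre
magnitude = np.abs(fshift)      # real-valued magnitude
s = np.log1p(magnitude)         # log scale; log1p avoids log(0)

# Locate the strongest frequency component after the shift.
peak = np.unravel_index(np.argmax(magnitude), magnitude.shape)
print(peak)
```

For a constant image the peak sits at the centre of the shifted spectrum, which is why `fftshift` is applied before plotting.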
.hist # histogram
.plot(xxx,'ro--') # r: red; o: circle marker; '--': dashed line
fig,axes=plt.subplots(2,1)
stacked=True
.plot(kind='kde') # density curve
.scatter(column)
.scatter_matrix(df)
.describe() # gives std, mean, etc., which help judge the normal range of the data
Drop duplicates
.drop_duplicates(column)
.ix[] is deprecated (use .loc and .iloc instead)
.loc: label-based
.iloc: position-based
.loc[i,x.split('|')] # i is the row index; x.split('|') then supplies the column labels
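The difference between label-based and position-based access is easiest to see on a frame whose index is not numeric. A minimal sketch, assuming pandas is installed; the frames `df` and `flags` and their labels are made up.

```python
import pandas as pd

df = pd.DataFrame(
    {"k": ["a", "a", "b"], "v": [1, 2, 3]},
    index=["r1", "r2", "r3"],
)

deduped = df.drop_duplicates("k")      # keeps the first row for each value of k
by_label = df.loc["r3", "v"]           # .loc: row and column labels
by_position = df.iloc[2, 1]            # .iloc: integer positions

# Label-based assignment: the split string supplies the column labels to set.
flags = pd.DataFrame(0, index=df.index, columns=["x", "y", "z"])
flags.loc["r1", "x|y".split("|")] = 1

print(deduped.index.tolist(), by_label, by_position)
print(flags.loc["r1"].tolist())
```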
set
set.union(*cat_sets)
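A quick illustration of unpacking a list of sets into one union. The contents of `cat_sets` are made up; the second form is a small variation that also works when the list is empty.

```python
cat_sets = [{"a", "b"}, {"b", "c"}, {"d"}]   # made-up category sets

# Unpack the list so that every set becomes one argument of union().
all_cats = sorted(set.union(*cat_sets))
print(all_cats)

# set.union(*[]) raises TypeError; starting from an empty set avoids that.
safe_empty = set().union(*[])
print(safe_empty)
```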
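[Setup: Python3 + re + requests + Chrome]
I learned this on imooc. When writing the code, it is best to split it into modules, which makes it easier to organize your thoughts and write the code.
The full code is on GitHub: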
python_grab
Some things to watch out for:
When logging, you can use
datetime.datetime.now().strftime("%Y====")
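As an illustration of a timestamped log line, here is a minimal sketch. The format string `"%Y-%m-%d %H:%M:%S"` is an assumption chosen for the example, not the format used in the original code, and `log_line` is a hypothetical helper.

```python
import datetime


def log_line(message, now=None):
    """Prefix a message with a timestamp; the format string is an assumption."""
    now = now or datetime.datetime.now()
    return now.strftime("%Y-%m-%d %H:%M:%S") + " " + message


print(log_line("fetch started"))
```

Passing `now` explicitly makes the helper easy to test with a fixed time.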
To add the missing prefix to partial links such as /item/dwdede, you can use enumerate:
for i,item in enumerate(your_list):
    item='dwdwdwdw'+item
    your_list[i]=item
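The same idea with a concrete prefix, plus an alternative that may be more forgiving. In this sketch `base` is a hypothetical site root, and `urllib.parse.urljoin` is a swapped-in standard-library helper that leaves already-absolute links untouched.

```python
from urllib.parse import urljoin

base = "https://example.com"                 # hypothetical site root
your_list = ["/item/dwdede", "/item/abc"]

# In-place update by index: enumerate supplies the position to write back to.
for i, item in enumerate(your_list):
    your_list[i] = base + item

# Alternative: build a new list; urljoin also copes with already-absolute links.
links = [urljoin(base, href) for href in ["/item/dwdede", "https://other.example/x"]]
print(your_list)
print(links)
```

Writing to `your_list[i]` is what makes the change stick; rebinding `item` alone would not modify the list.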
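About URI = {URL, URN}: a URN only names a resource, it does not locate it.
For simple strings it is fine to use regex through re, but for complex input such as HTML tags regex is not recommended: debugging takes a long time and the patterns are hard to reuse. BeautifulSoup can be used instead.
Of course, if you insist on using re, here are some examples for reference.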
- regex
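- re.match() returns a match object (None if nothing matches); its groups() method returns a tuple. If you need a list, use
[i for i in tuple_name]
- re.search() is a little better than match(), because match() only matches at the start of the string, which is where most mistakes come from
- re.findall() is the one I recommend: it returns a list, which is easy to work with, and it is not limited to the start of the string
- re.DOTALL lets the dot match newlines too; re.VERBOSE ignores whitespace and line breaks in the pattern; re.MULTILINE makes ^ and $ match at every line
- *, ? and + are all greedy; to turn greedy mode off and get the shortest match, use ??, +? and *?
- (?=xxx)xx lookaround (lookahead): checks that xxx follows at the current position without consuming it, then matches xx from that same position
- (?<=xxx)xx lookbehind: checks that xxx comes right before the current position, then matches xx after it
- | means or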
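The points in this list can be tried on a short string. A minimal sketch using only the standard-library `re` module; the input `text` is made up.

```python
import re

text = "id=42 name=foo\nid=7 name=bar"   # made-up input

m = re.match(r"name=(\w+)", text)          # None: match() anchors at position 0
s = re.search(r"name=(\w+)", text)         # finds the first occurrence anywhere
names = re.findall(r"name=(\w+)", text)    # list of every captured group
groups_list = list(s.groups())             # groups() is a tuple; list() converts it

greedy = re.search(r"<.*>", "<a><b>").group()   # longest possible match
lazy = re.search(r"<.*?>", "<a><b>").group()    # shortest possible match

after_id = re.findall(r"(?<=id=)\d+", text)     # digits preceded by 'id='
before_nl = re.findall(r"\w+(?=\n)", text)      # word followed by a newline

line_starts = re.findall(r"^id", text, re.MULTILINE)   # ^ matches at each line
across = re.search(r"foo.bar", "foo\nbar", re.DOTALL)  # dot matches the newline

print(m, names, greedy, lazy, after_id, before_nl, line_starts)
```

The lookbehind and lookahead results contain only the matched text, not the `id=` prefix or the newline that was checked for.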