Python
Convert dictionary entries into variables
See: Stackoverflow help page
try:
    import pandas as pd
    has_pandas = True
except ImportError:
    print('Install pandas to have method 3')
    has_pandas = False

class Bunch(object):
    def __init__(self, adict):
        self.__dict__.update(adict)

datapar = {"MH0": 53., "MH": 125.3, "MA0": 120., "MH1": 130., "La2": 0.01, "LaL": 0.001,
           "Mtr01": 91., "Mtr02": 92., "Mtr03": 93, "Mtrch1": 91.8, "Mtrch2": 92.8, "Mtrch3": 93.8,
           "y11R": 0.0, "y12R": 0.0, "y13R": 0.0, "y21R": 0.0, "y22R": 0.0, "y23R": 0.0,
           "y31R": 0.0, "y32R": 0.0, "y33R": 0.0, "y11I": 0.0, "y12I": 0.0, "y13I": 0.0,
           "y21I": 0.0, "y22I": 0.0, "y23I": 0.0, "y31I": 0.0, "y32I": 0.0, "y33I": 0.0}
SM = {'vev': 246.0, 'alpha_em': 1.0/128., 'G_F': 1.166e-5, 'm_e': 0.000511, 'm_mu': 0.1057, 'm_tau': 1.777}

# Method 1: defining namespaces
sm = Bunch(SM)
dp = Bunch(datapar)
print('Method 1, namespace:', sm.m_e, dp.MH1)

# Method 2: using the names directly (works at module level)
for key, val in SM.items():
    exec(key + '=val')
for key, val in datapar.items():
    exec(key + '=val')
print('Method 2, directly:', m_e, MH1)

# Method 3: Pandas: keep both dictionaries and variables
if has_pandas:
    ds = pd.Series(datapar)
    ds.MH1 = 140.
    print('Method 3, Pandas, keep both var and key:', ds.MH1, ds['MH1'])
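As a possible alternative to the hand-written Bunch class of Method 1, the standard library offers types.SimpleNamespace, which also exposes dictionary entries as attributes. A minimal sketch, assuming a shortened copy of the SM dictionary above:

```python
from types import SimpleNamespace

# Shortened copy of the SM dictionary used above (assumption: only three entries)
SM = {'vev': 246.0, 'm_e': 0.000511, 'm_mu': 0.1057}

# Unpack the dictionary into keyword arguments: each key becomes an attribute
sm = SimpleNamespace(**SM)
print(sm.vev, sm.m_e)

# vars() gives back the underlying dictionary
same_dict = vars(sm) == SM
print(same_dict)
```

Unlike Method 2, this does not touch the global namespace, so existing names cannot be overwritten by accident.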
What exactly does *args and **kwargs mean?
See http://stackoverflow.com for details:
You can call functions with *mylist and **mydict to unpack positional and keyword arguments:
def foo(a, b, c, d):
    print(a, b, c, d)

l = [0, 1]
d = {"d": 3, "c": 2}
foo(*l, **d)
Will print:
0 1 2 3
Example: Suppose there exists some function stu.stu
>>> def testd(stu_func, *args, **kwargs):
...     print('len', len(args))
...     for arg in args:
...         print(arg)
...     for key, value in kwargs.items():
...         print("%s == %s" % (key, value))
...     if 'vev' in kwargs:
...         print('vev found')
...     newargs = [10, 20, 0.1, 2]
...     a = stu_func(*newargs, **kwargs)
...     print(a)
...
>>> testd(stu.stu, [MN], [MDF], [np.sqrt(lu**2 + ld**2)], [lu/ld], vev=246.2)
----------
len 4
[30.0]
[100.0]
[1.5033296378372907]
[0.06666666666666667]
vev == 246.2
vev found
0.000884960384761
Professional way to pass functions with positional and keyword arguments:
See previous explanation and example
>>> def testd(stu_func, args=[], kwargs={}):
...     print('len', len(args))
...     for arg in args:
...         print(arg)
...     if kwargs is not None:
...         for key, value in kwargs.items():
...             print("%s == %s" % (key, value))
...         if 'vev' in kwargs:
...             print('vev found')
...     newargs = np.array([10, 20, 0.1, 2])
...     a = stu_func(*newargs, **kwargs)
...     print(a)
...
>>> testd(stu.stu, args=[[MN], [MDF], [np.sqrt(lu**2 + ld**2)], [lu/ld]], kwargs={'vev': 246.2})
---------------------
len 4
[30.0]
[100.0]
[1.5033296378372907]
[0.06666666666666667]
vev == 246.2
vev found
0.000884960384761
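One caveat with the args=[] and kwargs={} defaults above is that default objects are created once and shared between calls. A common variant uses None as the default instead. The following is only a sketch: call_with and demo are hypothetical names, with demo standing in for stu.stu.

```python
def call_with(func, args=None, kwargs=None):
    """Call func(*args, **kwargs), treating None as 'no arguments'."""
    # Mutable defaults such as args=[] are shared between calls,
    # so None is used as the sentinel instead.
    args = [] if args is None else list(args)
    kwargs = {} if kwargs is None else dict(kwargs)
    return func(*args, **kwargs)


def demo(a, b, scale=1.0):  # hypothetical stand-in for stu.stu
    return (a + b) * scale


result = call_with(demo, args=[1, 2], kwargs={'scale': 10.0})
print(result)
```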
Matplotlib
Numpy/Scipy
Tips and code with Numpy
Convert small numbers to zeros
np.set_printoptions(suppress=True)
Arrays manipulation
Reverse any array
>>> x[::-1]
Data mining with Numpy
Some txt file has the following structure
#var1 var2 var3
0.0 1.3 2.8
3.2 1.5 3.2
. . .
The following program could be used to analyse the data
import numpy as np

dic = {}
x = np.loadtxt('data.txt')  # lines starting with '#' are skipped by default
dic['var1'] = x[:, 0]
dic['var2'] = x[:, 1]
dic['var3'] = x[:, 2]
# then you can apply masks
y = x[np.sin(dic['var1']) == 0]
See module here
Copy array
Problem: In Python a new assignment is just a reference to the old structure:
>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> b = a
>>> b[0] = 5
>>> print(a)
[5 2 3]
To get b as an independent copy of a:
- Copy array a into b at a new address:
b = np.zeros_like(a)
b[:] = a
- Use the copy module:
>>> import copy
>>> a = np.array([1, 2, 3])
>>> b = copy.deepcopy(a)
>>> b[0] = 5
>>> print(a)
[1 2 3]
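The same reference semantics applies to plain Python lists, where the difference between a shallow and a deep copy also matters. A small sketch with nested lists (for NumPy arrays, a.copy() is another option):

```python
import copy

a = [[1, 2], [3, 4]]
alias = a                    # same object, no copy at all
shallow = copy.copy(a)       # new outer list, inner lists still shared
deep = copy.deepcopy(a)      # fully independent copy

a[0][0] = 99

print(alias[0][0])    # follows the change
print(shallow[0][0])  # follows the change too (inner list is shared)
print(deep[0][0])     # keeps the old value
```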
Exponential (and Polynomial) fit
fmin_powell with extra arguments
import subprocess
import numpy as np
import scipy.optimize

def relicdensity(x, mH, mh, mA, mE1, l2):
    datapar = {"mmh": 126., "mmH0": 63.54, "mmA0": 166.16, "mmHch": 73.78, "lamL": -0.00329, "lam2": 0.000567}
    datapar['lamL'] = x
    datapar['mmH0'] = mH
    datapar['mmh'] = mh
    datapar['mmA0'] = mA
    datapar['mmHch'] = mE1
    datapar['lam2'] = l2
    # writeinputf is a helper defined elsewhere (not shown here)
    writeinputf('data0.par', datapar)
    # subprocess.getoutput replaces the Python 2 commands.getoutput
    mo = subprocess.getoutput('../micromegas/Doblete_Inerte/main data0.par ')
    Omega = float(mo.split('Omega=')[1].split('\n')[0])
    print('Omega interno', Omega)
    return Omega

def fmino(xarr, mH, mh, mA, mE1, l2):
    Omega = relicdensity(xarr[0], mH, mh, mA, mE1, l2)
    OmegaExp = 0.11
    return np.abs((Omega - OmegaExp)/OmegaExp)

def optimo(x, mH, mh, mA, mE1, l2):
    '''looking for the minimum'''
    xarr = np.asarray([x])
    # extra arguments of fmino are passed through args
    x0 = scipy.optimize.fmin_powell(fmino, xarr, args=(mH, mh, mA, mE1, l2), xtol=1E-3, ftol=1E-3)
    return x0
Convert scalar function to vectorial function
Suppose you have a scalar function f(x,y,a=2). Instead of just
>>> map(f,x,y)
to convert it into a vectorial function of x and y, i.e., one accepting arrays x and y, use
import numpy as np
# excluded takes the names (or positions) of the arguments that must not be vectorized
vf = np.vectorize(f, excluded=['a'])
# Then
>>> vf([1, 2], [3, 4])
----
array([0.1, 0.4])
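For the plain map approach, note that in Python 3 map returns an iterator, and a keyword such as a can be fixed beforehand with functools.partial. A minimal sketch, where the body of f is a hypothetical example (the original function is not shown):

```python
from functools import partial

def f(x, y, a=2):  # hypothetical scalar function
    return a * x + y

x = [1, 2]
y = [3, 4]

# In Python 3 map returns an iterator, so wrap it in list()
default_a = list(map(f, x, y))

# Fix the keyword argument a before mapping over x and y
other_a = list(map(partial(f, a=10), x, y))

print(default_a, other_a)
```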
String name and variable name conversions
Convert string to variable name
>>> exec('var=2')
>>> print(var)
2
Convert variable name to string
def name(**variables):
    return [x for x in variables][0]

var = 2
print(name(var=var))
var
setup.py
Install with
sudo python setup.py install --record install.record
Uninstall with
sudo rm $(cat install.record)
Dictionaries
To make a new ordered dictionary from the original, sorting by the values:
>>> from collections import OrderedDict
>>> # regular unsorted dictionary
>>> d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}
>>> # dictionary sorted by key
>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
>>> # dictionary sorted by value
>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])
>>> # dictionary sorted by length of the key string
>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
OrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])
The OrderedDict behaves like a normal dict. See here
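Since Python 3.7 a plain dict also preserves insertion order, so the same sorting can be sketched without OrderedDict (assuming order-sensitive equality and the extra OrderedDict methods are not needed):

```python
d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}

# Build new plain dicts from the sorted (key, value) pairs
by_key = dict(sorted(d.items(), key=lambda t: t[0]))
by_value = dict(sorted(d.items(), key=lambda t: t[1]))

print(list(by_key))
print(list(by_value))
```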
Shell
Tips and tricks
- How to downcase the first character of a string in Python? Example:
funcu = lambda s: s[:1].upper() + s[1:] if s else ''
funcl = lambda s: s[:1].lower() + s[1:] if s else ''

def upperonlyfirst(s):
    ss = []
    for i in s.lower().split():
        if len(i) > 3:
            ss.append(funcu(i))
        else:
            ss.append(i)
    return ' '.join(ss)

>>> s = 'JOURNAL OF COSMOLOGY AND ASTROPARTICLE PHYSICS'
>>> upperonlyfirst(s)
'Journal of Cosmology and Astroparticle Physics'
Beautiful Soup
To extract one specific html tag with a class attribute, like the ones in the following samplehtml:
<strong>
<div class="record_body">
<a class = "titlelink" href= "/record/903906"> Antioquia U. </a>
</strong>
<br />
<small>
Universidad de Antioquia, Instituto de Fisica, Grupo de Fenomenologia, de Interacciones Fundamentales, P.O. Box 1226, Medellin, COLOMBIA<br />
<a href="http://fisica.udea.edu.co/">http://fisica.udea.edu.co/</a><br />
<span class='moreinfo'><a href="/search?p=%22Antioquia%20U.%22&f=affiliation">158 Papers from Antioquia U.</a></span>
</small>
</div>
soup=BeautifulSoup(samplehtml)
Use either
soup.findAll("div", { "class" : "record_body" })
or
soup.findAll("a", { "class" : "titlelink" })
Pandas
Initialize pandas with a list of dictionaries
import pandas as pd
x = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 3, 'b': 4}])
How to read a file with space(s) separated values
See: http://stackoverflow.com/questions/19632075/how-to-read-file-with-space-separated-values
$ cat kk.csv
a b c
1 23 9
Fast way (delim_whitespace is deprecated in recent pandas versions in favor of sep=r"\s+"):
>>> pd.read_csv("kk.csv", delim_whitespace=True)
Use delimiter: string, default None. Alternative argument name for sep. Regular expressions are accepted.
>>> pd.read_csv("kk.csv", delimiter=r"\s+")
|   | a | b  | c |
| 0 | 1 | 23 | 9 |
Recommended main program
import pandas as pd

rows = []
ins = pd.Series({'A': 1, 'B': 2})
for i in range(2):
    output_data = pd.Series({'C': ins.A + ins.B + i, 'D': ins.A - ins.B + i})
    # Series.append and DataFrame.append were removed in pandas 2.0; use pd.concat
    xs = pd.concat([ins, output_data])
    rows.append(xs.to_dict())
xd = pd.DataFrame(rows)
xd.to_csv('data.csv', index=False)
# load as
# xd = pd.read_csv('data.csv')
Display pandas in some php webpage
See: php
Pandas configurations
Increase the pretty printed rows and columns
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
Increase the number of characters of the output cell:
pd.set_option('display.max_colwidth',200)
Read complex numbers in pandas
Drop columns from dataframe
df=df.drop('column_name',axis=1)
Drop rows only considering certain columns for identifying duplicates
By default use all of the columns: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html
df.drop_duplicates(subset=['column'])
Rename columns of a DataFrame
df.rename(columns={"A": "ehh", "C": "see"})
Split a column of a Pandas DataFrame and add new columns
Consider a DataFrame df, with a column with format
A,B;C,D;E,F;...
I want a new column with just the format A,B:
df['new_column']=df['column'].str.split(';').str[1].fillna('')
Rounding problems
decimal module
>>> a=1E-11
>>> b=11000
>>> c=-a-b
#Right result
>>> 1.0001*a+(b+c)
9.060529822707175e-13
#Wrong result!
>>> 1.0001*a+b+c
0.0
#Fix the problem:
>>> from decimal import Decimal
>>> float(Decimal(1.0001*a)+Decimal(b)+Decimal(c))
9.060529822707176e-13
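For sums of plain floats like this one, math.fsum from the standard library may be a lighter alternative to Decimal, because it keeps track of the low-order bits lost in intermediate additions. A sketch with the same numbers as above:

```python
import math

a = 1e-11
b = 11000
c = -a - b

# Left-to-right addition: the small term is absorbed when added to b
naive = 1.0001 * a + b + c

# fsum adds all terms without losing the intermediate rounding errors
stable = math.fsum([1.0001 * a, b, c])

print(naive, stable)
```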
Classes
A Guide to Python's Magic Methods: http://www.rafekettler.com/magicmethods.html
see: https://github.com/rescolo/getinspire/blob/master/getinspire/getinspire
class A(object):
    '''
    Inheriting from object allows passing __init__ to child objects
    >>> a = A()
    init...
    >>> a(1)
    call:
    1
    '''
    classvar = 'Allows to use A.classvar directly! (without instance)'
    def __init__(self):
        self.var = 'value'  # Only defined after an instance is created: i=A() -> i.var
        print("init...")
    def __call__(self, x):
        print("call:")
        return x
    def __setitem__(self, k, v):
        '''See: http://www.diveintopython.net/object_oriented_framework/special_class_methods.html'''
        print({k: v})
USAGE:
o = A()
print(o(1.1))
o['a'] = 1
>>>>OUTPUT<<<<<
init...
call:
1.1
{'a': 1}
Inherited class variable modification in Python
class Parent(object):
    classfoobar = 'Hello'
    def __init__(self):
        self.foobar = ' world'

class Child(Parent):
    classfoobar = Parent.classfoobar + ' cruel'
    def __init__(self):
        super(Child, self).__init__()
        self.foobar = self.classfoobar + self.foobar

>>> Parent.classfoobar
'Hello'
>>> P = Parent()
>>> P.foobar
' world'
>>> Child.classfoobar
'Hello cruel'
>>> C = Child()
>>> C.foobar
'Hello cruel world'
About `super`: What is the correct way to extend a parent class method in modern Python
You use super:
Example add method has_key('k') to python 3 dict:
class hkdict(dict):
    def __init__(self, *args, **kwargs):
        super(hkdict, self).__init__(*args, **kwargs)
    def has_key(self, k):
        return k in self
#============
>>> hkdict(python3_dict).has_key('hola mundo')
In other words, a call to super returns a fake object which delegates attribute lookups to classes above you in the inheritance chain. Points to note:
- This does not work with old-style classes, so if you are using Python 2.x, you need to ensure that the top class in your hierarchy inherits from object.
- You need to pass your own class and instance to super in Python 2.x. This requirement was waived in 3.x.
- This will handle all multiple inheritance correctly. (When you have a multiple inheritance tree in Python, a method resolution order is generated and the lookups go through parent classes in this order.)
Take care: there are many places to get confused about multiple inheritance in Python. You might want to read super() Considered Harmful. If you are sure that you are going to stick to a single inheritance tree, and that you are not going to change the names of classes in said tree, you can hardcode the class names as you do above and everything will work fine.
Example:
class Person(object):
    def greet(self):
        print("Hello")

class Waiter(Person):
    def greet(self):
        super(Waiter, self).greet()
        print("Would you like fries with that?")
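As noted above, Python 3 no longer requires passing the class and instance to super. A sketch of the same Person/Waiter example in that style (here greet returns the lines instead of printing them, only to make the result easy to inspect):

```python
class Person:
    def greet(self):
        return ["Hello"]


class Waiter(Person):
    def greet(self):
        # Python 3: super() without arguments resolves the class and instance itself
        lines = super().greet()
        lines.append("Would you like fries with that?")
        return lines


print("\n".join(Waiter().greet()))
```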
pyslha
# pip install pyslha
Last version pyslha 3.1.1. To initialize one input file
#The order matters
spcskel='''
block modsel
block sminputs
block minpar
block sphenoinput
'''
LHA=pyslha.readSLHA(spcskel,ignorenobr=True,ignorenomass=True).blocks
Regex
- Python implementation of grep with re (TODO: option -v)
import re

def grep(pattern, multilinestring):
    '''Grep replacement in python
    as in: $ echo $multilinestring | grep pattern
    dev: re.M is for multiline strings
    '''
    grp = re.finditer('(.*)%s(.*)' % pattern, multilinestring, re.M)
    return '\n'.join([g.group(0) for g in grp])

multilinestring = 'From Here\nto Eternity\nto Infinity'
print(grep('^to', multilinestring))
>OUTPUT
to Eternity
to Infinity
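The pending -v option could be sketched as follows. This is one possible line-by-line variant, not the original implementation, and the invert parameter name is an assumption:

```python
import re

def grep(pattern, multilinestring, invert=False):
    '''Return the lines matching pattern, or the non-matching
    ones when invert=True (like grep -v).'''
    regex = re.compile(pattern)
    lines = multilinestring.splitlines()
    # Keep a line when "it matches" differs from "invert"
    return '\n'.join(line for line in lines
                     if bool(regex.search(line)) != invert)

multilinestring = 'From Here\nto Eternity\nto Infinity'
print(grep('^to', multilinestring))
print(grep('^to', multilinestring, invert=True))
```

Searching each line separately makes re.M unnecessary, since ^ and $ already refer to the start and end of the single line being tested.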
Import modules from other directories
Import module from parent directory
import sys
from pathlib import Path

cmd_folder = Path.cwd().parent.as_posix()
if cmd_folder not in sys.path:
    sys.path.insert(0, cmd_folder)
import your_module
Commands vs subprocess
commands.getoutput('command --options args')
subprocess.Popen('command --options args',shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE).communicate()
subprocess.Popen('ls file*',shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE).communicate()
However, the use of the `shell` option is considered insecure and, without it, the full command must be passed as a list, e.g.:
subprocess.Popen('command --options args'.split(),stdout=subprocess.PIPE,stderr=subprocess.PIPE).communicate()
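A plain str.split() breaks arguments that contain quoted spaces. One way around this, sketched here under the assumption of a POSIX-style command line, is shlex.split together with subprocess.run; the command itself is just an illustration that calls the running Python interpreter:

```python
import shlex
import subprocess
import sys

# Illustrative command: run the current interpreter with a quoted argument
command = shlex.quote(sys.executable) + ''' -c "print('hello world')"'''

# shlex.split respects quotes, unlike str.split
argv = shlex.split(command)

result = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        universal_newlines=True)
print(result.stdout.strip())
```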
One convenient implementation inside of a Python 3 code could be:
import subprocess  # must be imported at module level to be visible inside the method

class subproccess_to_commands(object):
    '''
    Compatibility class for the Python 2 commands module
    Initialize as:
        commands=subproccess_to_commands()
    USAGE after initialization:
        commands.getoutput("command")
    '''
    def getoutput(self, *args, **kargs):
        o, e = subprocess.Popen(*args, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, **kargs).communicate()
        if o:
            return o.decode('utf8').strip()
        else:
            return e.decode('utf8').strip()
To get output line by line in python 3 (see https://stackoverflow.com/a/17698359/2268280 and https://stackoverflow.com/a/28319191/2268280 )
import subprocess

def execute(*args, **kwargs):
    kwargs['stdout'] = subprocess.PIPE
    kwargs['bufsize'] = 1
    kwargs['universal_newlines'] = True
    with subprocess.Popen(*args, **kwargs) as p:
        for line in p.stdout:
            print(line, end='')  # process line here
    # the return code is only available once the process has finished
    if p.returncode != 0:
        raise subprocess.CalledProcessError(p.returncode, p.args)
execute('for i in $(seq 1 3);do echo $i; sleep 1;done',shell=True)
Compatibility Python 2.x Python 3.x
Use `print` as a function and use the Python 3.x standards which are compatible with Python 2.x. If that is not possible, use compatibility libraries like `future` or `six`, or create your own. As an example, check this commit in GitHub
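A small home-made compatibility layer can be as simple as a guarded import. The sketch below picks getoutput from the Python 2 commands module when it exists and from subprocess otherwise; the echo command is only an illustration:

```python
try:
    from commands import getoutput      # Python 2
except ImportError:
    from subprocess import getoutput    # Python 3

# print used as a function works in both versions for a single argument
print(getoutput('echo hello'))
```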
Testing in Python
See https://krother.gitbooks.io/python-testing-tutorial/content/
Recommended framework in python is nose
pip3 install nose
The assertions are checked automatically for all test functions with the shell command:
nosetests example_test.py
Example
from nose.tools import assert_equal

def test_example():
    assert_equal(1 + 1, 2)
nosetests only_test_this.py
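Since nose appears to be no longer maintained, the same check can also be sketched with the standard library unittest module, which needs no installation. The class name ExampleTest is an assumption for illustration:

```python
import unittest


class ExampleTest(unittest.TestCase):
    def test_example(self):
        self.assertEqual(1 + 1, 2)


# Run the tests programmatically
# (from a shell: python3 -m unittest example_test.py)
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print(outcome.wasSuccessful())
```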