Pandas Notes

  • When using an arbitrary index set through the set_index function:
    • df.loc[number, :] selects by label, so number must be one of the values in that index
    • df.iloc[0, :] returns the first row regardless of the index set
      • the catch is that iloc takes columns by position only, not by column name
    • [not preferred] df.ix[number, :] also looks number up in the index; .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so use .loc or .iloc instead
    • For more details:
      http://www.shanelynn.ie/select-pandas-dataframe-rows-and-columns-using-iloc-loc-and-ix/
  • When two dataframes share the same index (one being a subset of the other), a column from one can be assigned directly to the other, e.g. df["col"] = subset["col"]; pandas aligns on the index, so the rows present in the subset are filled and the remaining rows become NaN
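A minimal sketch of the points above, using a small made-up dataframe. The column names (key, name, score, bonus) and the values are assumptions chosen only for illustration:

```python
import pandas as pd

# Hypothetical data; "key" becomes the arbitrary index.
df = pd.DataFrame(
    {"key": [10, 20, 30], "name": ["a", "b", "c"], "score": [1.0, 2.0, 3.0]}
).set_index("key")

# loc looks the value up in the index: 20 is a label, not a position.
row_by_label = df.loc[20, :]

# iloc ignores the index and works by position: 0 is always the first row.
first_row = df.iloc[0, :]

# With iloc the column is given by position too (column 1 is "score").
first_score = df.iloc[0, 1]

# One way to combine a row position with a column name.
first_score_by_name = df.iloc[0, df.columns.get_loc("score")]

# Column assignment aligns on the index: subset only has keys 10 and 30,
# so key 20 ends up as NaN in df["bonus"].
subset = df.loc[[10, 30]].copy()
subset["bonus"] = [0.5, 1.5]
df["bonus"] = subset["bonus"]

print(df)
```

Calling df.loc[0, :] on this dataframe raises a KeyError, because 0 is not one of the index values.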

 

Windows Python Development Environment Setup

  1. Install PowerShell on Windows 7
  2. Install PyCharm Professional
  3. Install Python
  4. Install pip by downloading this script (https://raw.github.com/pypa/pip/master/contrib/get-pip.py) and running it with Python
  5. Make sure the following directories are in the PATH environment variable:
    – C:\Python27\;C:\Python27\Scripts
  6. Inside Windows PowerShell, run the following:
    1. pip install virtualenv
    2. pip install requests[security]  – to eliminate the warning:
      c:\python27\lib\site-packages\pip\_vendor\requests\packages\urllib3\util\ssl_.py:90: InsecurePlatformWarning: A true SSL Context object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  7. Create a new virtualenv using PyCharm by going into Project Interpreter
    1. add whatever packages are needed using the same GUI
    2. install ipython
  8. Copy the tcl8.5 and tk8.5 directories from C:\Python27\tcl to C:\Users\Ben\Baseline\Lib
  9. Now we should be able to plot from the Python Console in PyCharm
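Step 5 can be checked with a short script. This is only a sketch: the helper name missing_path_entries is made up for illustration, and the directories are the ones listed in step 5. The comparison ignores case and trailing backslashes, since Windows treats those as the same directory.

```python
import ntpath
import os


def missing_path_entries(path_value, required):
    """Return the required directories that are absent from a Windows PATH string."""
    def norm(p):
        # Windows paths are case-insensitive; also ignore a trailing backslash.
        return ntpath.normcase(p.strip()).rstrip("\\")

    present = {norm(p) for p in path_value.split(";") if p.strip()}
    return [d for d in required if norm(d) not in present]


# Directories from step 5.
required = [r"C:\Python27", r"C:\Python27\Scripts"]

# An empty list means both directories are on the PATH.
print(missing_path_entries(os.environ.get("PATH", ""), required))
```

Run it from a newly opened PowerShell window, since a shell that was already open does not pick up changes to PATH.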