File I/O: Reading ASCII data

In this tutorial we are going to read an ASCII file which contains the concentration of CO2 measured at Cape Point between 1 January 1995 and 31 December 2007.
You can download the file here: CPT_CO2_dm_95_07.txt

The first few lines look like this:


Date All data Filtd data
01-Jan-95 #N/A #N/A
02-Jan-95 #N/A #N/A
03-Jan-95 #N/A #N/A
04-Jan-95 #N/A #N/A
15-Jan-95 357.98 357.63
16-Jan-95 357.89 357.71
...

So there is 1 header line. The rest consists of 3 columns separated by tabs, with bad values marked by the string #N/A. First let's start a Python session. Then within the Python interpreter we define our path and file names.

pname = ''
fname = 'CPT_CO2_dm_95_07.txt'

Then we open the file and assign the resulting file object to a variable called fid.

fid = open(pname+fname)

Typing fid at the prompt will give something like:
<open file 'CPT_CO2_dm_95_07.txt', mode 'r' at 0xb77c3128>
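The opening step can also be written with a with block, which closes the file automatically. The following is a minimal sketch for Python 3, not the session used in this tutorial: it creates a one-line stand-in file in a temporary directory so that it runs without the real download, and it builds the path with os.path.join instead of pname+fname.

```python
import os
import tempfile

pname = tempfile.mkdtemp()            # stand-in directory for this sketch
fname = "CPT_CO2_dm_95_07.txt"
path = os.path.join(pname, fname)     # inserts the path separator for us

# Create a one-line stand-in file so the sketch runs without the download
with open(path, "w") as fid:
    fid.write("Date\tAll data\tFiltd data\n")

# Open for reading; the file is closed automatically when the block ends
with open(path) as fid:
    mode = fid.mode                   # default mode is 'r' (read, text)
    still_open = not fid.closed

closed_after = fid.closed
print(mode, still_open, closed_after)
```

With plain open() as used in this tutorial, the file stays open until fid.close() is called.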

Now I am going to use the readline and readlines methods to read my data line by line.

# Read 1 line in header
header=fid.readline()

This reads only 1 line (in this case, the first line) into a variable called header.

Typing header at the prompt shows that it is a string:
'Date\tAll data\tFiltd data\r\n'

Then I read all the remaining data lines in one go using the readlines method:

data = fid.readlines()

All my data is now stored in a list called data, with one string per line. I can check the number of rows using the len function.

len(data)
4748
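The difference between readline and readlines can be seen on a small example. This sketch uses an in-memory io.StringIO object as a stand-in for the open data file (an assumption made so that it runs without the download); the rows are copied from the listing at the top of this tutorial.

```python
import io

# Stand-in for the open data file: header plus three rows from the listing
text = (
    "Date\tAll data\tFiltd data\n"
    "01-Jan-95\t#N/A\t#N/A\n"
    "15-Jan-95\t357.98\t357.63\n"
    "16-Jan-95\t357.89\t357.71\n"
)
fid = io.StringIO(text)

header = fid.readline()    # consumes the first line only
data = fid.readlines()     # consumes everything that is left, as a list
rest = fid.readline()      # nothing remains, so this is an empty string

print(repr(header))
print(len(data))
print(repr(rest))
```

Each call moves the read position forward, which is why readlines returns only the lines after the header. Note that in Python 3 a file opened in text mode has \r\n translated to \n by default, so the lines would end in \n rather than \r\n.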

Now I can go through each line to extract the variables of interest. If I print out my 1st data row, I see that each row ends with \r\n, which means a carriage return and a newline. I also see that the elements in each row are separated by a tab, \t.

data[0]
'01-Jan-95\t#N/A\t#N/A\r\n'

I also want to replace all the bad values "#N/A" with "NaN". To do that I use the replace method. Here is an example for the 1st row of data:

row = data[0]
row = row.replace("#N/A","NaN")

Now I remove the carriage return and newline characters from the end of my row

row = row.strip('\r\n')

And then I separate my 3 variables using the tab character

a,b,c=row.split('\t')
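The three string operations above can be combined into one small helper. This is only a sketch: split_row is a hypothetical name, and the check on the number of fields is an addition that is not part of the original steps.

```python
def split_row(row):
    """Return the three tab-separated fields of one data line as strings."""
    row = row.replace("#N/A", "NaN")   # mark bad values
    row = row.strip("\r\n")            # drop trailing carriage return / newline
    fields = row.split("\t")
    if len(fields) != 3:
        # Added safety check: fail early on a malformed line
        raise ValueError("expected 3 tab-separated fields, got %d" % len(fields))
    return fields

a, b, c = split_row("01-Jan-95\t#N/A\t#N/A\r\n")
print(a, b, c)
```

strip('\r\n') removes any run of \r and \n characters from both ends of the string, so the same call works whether the line ends in \r\n or just \n.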

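The 3 elements are now stored in string variables a, b and c. I can directly convert the two data strings to floats using the float function. This also works for the "NaN" strings, which become the floating-point not-a-number value.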

b=float(b)
c=float(c)
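The reason for replacing "#N/A" before converting can be checked directly. This short sketch uses the math module, which is not used elsewhere in this tutorial, to test for NaN.

```python
import math

# float() cannot parse the raw bad-value marker ...
try:
    float("#N/A")
    parsed_raw_marker = True
except ValueError:
    parsed_raw_marker = False

# ... but it does accept the string "NaN"
b = float("NaN")
c = float("357.63")

print(parsed_raw_marker)
print(math.isnan(b))
print(b == b)    # NaN never compares equal, not even to itself
print(c)
```

Because NaN does not compare equal to anything, bad values should be detected with math.isnan (or the equivalent array function) rather than with ==.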

For the time variable it is more difficult. In this example, I will convert my time variable from a string to a number. I will define my time as the number of days since 1-January-1950. This is done using two classes from Python's datetime module, called datetime and timedelta.

from datetime import datetime, timedelta

I convert my variable a to a datetime object

datetime.strptime(a, "%d-%b-%y")

and I then convert it to the number of days since 1-Jan-1950

(datetime.strptime(a, "%d-%b-%y")-datetime(1950,1,1)).days
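The conversion and its inverse can be sketched as two small functions. The names EPOCH, to_days and from_days are hypothetical; the format string and the reference date are the ones used above.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1950, 1, 1)   # reference date used in this tutorial

def to_days(datestr):
    """Convert a string such as '01-Jan-95' to whole days since 1-Jan-1950."""
    return (datetime.strptime(datestr, "%d-%b-%y") - EPOCH).days

def from_days(ndays):
    """Convert a day count back to a datetime object."""
    return EPOCH + timedelta(days=ndays)

n = to_days("01-Jan-95")
print(n)
print(from_days(n))
```

Subtracting two datetime objects gives a timedelta, and its days attribute is the integer stored in the time variable. Two points to keep in mind: %y reads a two-digit year, which Python maps to 1969-1999 for 69-99 and to 2000-2068 for 00-68, and %b matches month abbreviations in the current locale, so this assumes an English locale.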

Let us now put it all together in a loop.

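# ones and nan come from numpy; datetime was imported above
import numpy as np

# Initialise my variables as NaNs, with one element per data line
tserial = np.ones((len(data),))*np.nan
allData = np.ones((len(data),))*np.nan
filtData = np.ones((len(data),))*np.nan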
i=0
for row in data:

    row = row.replace("#N/A","NaN")
    row = row.strip('\r\n')
    a,b,c=row.split('\t')
    # Store time as days since 1-Jan-1950
    tserial[i]=(datetime.strptime(a, "%d-%b-%y")-datetime(1950,1,1)).days
    allData[i]= float(b)
    filtData[i] = float(c)
    i=i+1
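The whole procedure can also be wrapped in a function. The following is a sketch under a few assumptions: read_co2 is a hypothetical name, it collects the values in plain Python lists instead of pre-allocated arrays, it skips blank lines, and it is run here on an in-memory stand-in for the data file.

```python
import io
import math
from datetime import datetime

def read_co2(fid):
    """Read the CO2 data from an open file object.

    Returns three lists: days since 1-Jan-1950, all data, filtered data.
    """
    fid.readline()                      # skip the single header line
    tserial, all_data, filt_data = [], [], []
    for row in fid:                     # a file object yields one line at a time
        row = row.strip("\r\n")
        if not row:                     # ignore blank lines, e.g. at end of file
            continue
        a, b, c = row.replace("#N/A", "NaN").split("\t")
        # Store time as days since 1-Jan-1950
        tserial.append((datetime.strptime(a, "%d-%b-%y") - datetime(1950, 1, 1)).days)
        all_data.append(float(b))
        filt_data.append(float(c))
    return tserial, all_data, filt_data

# Stand-in for the real file, using rows from the listing at the top
sample = io.StringIO(
    "Date\tAll data\tFiltd data\n"
    "01-Jan-95\t#N/A\t#N/A\n"
    "15-Jan-95\t357.98\t357.63\n"
)
t, all_data, filt_data = read_co2(sample)
print(t[1] - t[0], all_data, filt_data)
```

With the real file, the call would be made inside a with open(...) block, and the lists can be converted to arrays afterwards if array operations are needed.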

You can download the example code from: getCo2.py file

Last Updated on Monday, 06 August 2012 17:22

Marjolaine Rouault's Tutorial

Last Updated on Monday, 30 July 2012 17:25

CHPC Introductory Scientific Programming School

Nov 15, 12:26 PM
 

A funded, full-week Introductory Scientific Programming School for Science and Engineering students who wish to advance their skills in Linux (Ubuntu) and the Python programming language.


27 Nov. – 04 Dec. 2011

Hosted by the Centre for High Performance Computing (CHPC) of the Council for Scientific and Industrial Research (CSIR) at Meraka Institute and funded by the Department of Science and Technology (DST).

Syllabus to be covered includes:
Two full days of introduction to Linux (Ubuntu), covering the following topics:

Overview of Ubuntu Linux Desktop; Running Commands and Getting Help; Browsing the File System; the bash Shell; Standard I/O and Pipes; Users, Groups and Permissions; vi and vim Editor Basics; the Linux Filesystem In-Depth; Advanced Topics in Users, Groups and Permissions; Printing; Introduction to String Processing; String Processing with Regular Expressions; Finding and Processing Files; and Investigating and Managing Processes.

Four full days of introduction to Python programming, covering the following topics:

Python Basics, Python Objects, Numbers, Sequences, Dictionaries, Conditionals and Loops, Files and Input/Output, Errors and Exceptions.

Download the full application form in MS Word format.

Download the full program for the school here


CLOSING DATE FOR APPLICATIONS: 18h00 Sunday 6 November 2011


Should you wish to become one of the participants, please complete the application form and email the document (with e-mail subject: CHPC Introductory Programming School) back to [email address not shown] before the closing date. Successful candidates will be notified from 11 November 2011.

Last Updated on Tuesday, 24 July 2012 18:17

SIG Workshops

The CHPC actively partners with Special Interest Groups (SIGs) to address the needs of specific research domains.

Through collaboration with a wide range of stakeholders, the CHPC and its community are able to link up with technology missions in the National R&D Strategy in order to jointly address a wide range of notable challenges.

Stakeholders, general users and members of the CHPC have self-assembled into Special Interest Groups (SIGs) involving various research areas:
 

  •     Advanced Computer Engineering
  •     Astrophysics, Astronomy & Cosmology
  •     Bioinformatics & Epidemiology
  •     Chemistry, Biochemistry & Material Science
  •     Computational Earth Science
  •     Computational Engineering
  •     Computational Finance
  •     Computational Graphics & Visualisation
  •     Computer Science & High-end Computing Technology
  •     Computational Physics


Several multi-disciplinary and multi-institutional collaborations resulted from this partnership framework. SIG Workshop Sponsorships have enabled the successful hosting of several workshops by local and international experts in different regions of South Africa, involving not only academia, but also experts from industry and the public sector.

Last Updated on Tuesday, 21 August 2012 13:37
