Python Pattern Module
The command line argument parser
I use the very friendly argparse library for the command line arguments. I set the arguments up and parse them in a separate method.
    from argparse import ArgumentParser
    # ...
        def parse_args(self, args=None):
            """Parse command line arguments."""
            desc = "Write random numbers to a compressed csv file."
            parser = ArgumentParser(description=desc)
            parser.add_argument('--mini', '-m', type=int, default=100,
                                help="Minimal number")
            parser.add_argument('--maxi', '-M', type=int, default=200,
                                help="Maximal number")
            parser.add_argument('--numbers', '-n', type=int, default=100,
                                help="Number of numbers")
            parser.add_argument('--drift', type=float, default=10.0,
                                help="Drift")
            tgroup = parser.add_argument_group("Testing instead")
            tgroup.add_argument('--test', '-t', action="store_true",
                                default=False,
                                help="Perform doc tests and exit instead.")
            # Parse the given list if there is one, else the command line
            if args:
                self.args = parser.parse_args(args)
            else:
                self.args = parser.parse_args()
            self.logger.debug(self.args)
            return
The nice grouping and typical command line look and feel is excellent:
    $ python template.py --help
    usage: template.py [-h] [--mini MINI] [--maxi MAXI] [--numbers NUMBERS] [--drift DRIFT] [--test]

    Write random numbers to a compressed csv file.

    optional arguments:
      -h, --help            show this help message and exit
      --mini MINI, -m MINI  Minimal number
      --maxi MAXI, -M MAXI  Maximal number
      --numbers NUMBERS, -n NUMBERS
                            Number of numbers
      --drift DRIFT         Drift

    Testing instead:
      --test, -t            Perform doc tests and exit instead.
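A side note on the `if args:` branch in `parse_args`: as far as I know it is not strictly needed, since `ArgumentParser.parse_args` falls back to `sys.argv[1:]` when it is handed `None`. (The one difference is an empty list, which the original sends to the command line and the shortcut parses as "no arguments".) Here is a minimal sketch of the same options as a plain function. The name `build_parser` is mine and not part of template.py.

```python
from argparse import ArgumentParser


def build_parser():
    """Build a parser with the same options as the example above."""
    desc = "Write random numbers to a compressed csv file."
    parser = ArgumentParser(description=desc)
    parser.add_argument('--mini', '-m', type=int, default=100,
                        help="Minimal number")
    parser.add_argument('--maxi', '-M', type=int, default=200,
                        help="Maximal number")
    parser.add_argument('--numbers', '-n', type=int, default=100,
                        help="Number of numbers")
    parser.add_argument('--drift', type=float, default=10.0, help="Drift")
    tgroup = parser.add_argument_group("Testing instead")
    tgroup.add_argument('--test', '-t', action="store_true",
                        help="Perform doc tests and exit instead.")
    return parser


# An explicit list is parsed instead of sys.argv[1:];
# parse_args(None) would read the real command line.
args = build_parser().parse_args(['--mini', '10', '-M', '50', '--drift', '1'])
print(args.mini, args.maxi, args.numbers, args.drift, args.test)
# prints: 10 50 100 1.0 False
```

Passing a list like this is also what makes the parser easy to exercise from a doctest.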
The logging
I use the vanilla logging library in Python. It is a bit hard to set up if you want custom formats (for instance different formats in a file and on the console), but the default formatting is no problem and quickly done.
    import logging
    # ...
            self.logger = logging.getLogger(name="Data")
            logging.basicConfig(level=logging.DEBUG)
    # ...
            self.logger.debug(self.args)
The output is something like:
    DEBUG:Data:Ctor OK.
    DEBUG:Data:Setting up handle
    DEBUG:Data:Getting 100 numbers
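For the harder case, different formats on the console and in a file, one possible setup is to attach two handlers to the logger instead of calling `basicConfig`. This is only a sketch and not what template.py does. The helper name `make_logger`, the file name `data.log` and the format strings are placeholders I picked.

```python
import logging
import os
import tempfile


def make_logger(logfile, name="Data"):
    """Return a logger with a terse console format and a detailed file format."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering

    # Console: INFO and above, same look as the basicConfig default
    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    console.setFormatter(
        logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

    # File: everything, with a timestamp in front
    to_file = logging.FileHandler(logfile)
    to_file.setLevel(logging.DEBUG)
    to_file.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)-8s %(name)s %(message)s"))

    logger.addHandler(console)
    logger.addHandler(to_file)
    return logger


# Hypothetical log file location, chosen for the sketch only
logfile = os.path.join(tempfile.mkdtemp(), "data.log")
logger = make_logger(logfile)
logger.debug("Only the file sees this")
logger.info("Both the console and the file see this")
```

Calling `make_logger` twice with the same name would add the handlers twice, so it should be called once per logger.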
A good idea might be to parse the arguments before setting up logging in case you want to control the log level(s) to show.
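A small sketch of that idea: parse first, then hand the chosen level to `basicConfig`. The `--loglevel` option is hypothetical, template.py has no such argument.

```python
import logging
from argparse import ArgumentParser


def parse_args(args=None):
    """Parse a (hypothetical) --loglevel option."""
    parser = ArgumentParser(description="Log level sketch")
    parser.add_argument('--loglevel', default='INFO',
                        choices=['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'],
                        help="Lowest log level to show")
    return parser.parse_args(args)


args = parse_args(['--loglevel', 'WARNING'])

# getattr maps the level name to the numeric constant, e.g. logging.WARNING
level = getattr(logging, args.loglevel)
logging.basicConfig(level=level)

logger = logging.getLogger("Data")
logger.debug("Not shown at WARNING level")
logger.warning("Shown at WARNING level")
```

Note that `basicConfig` does nothing if the root logger already has handlers, so this only works when it is the first logging setup in the program.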
The doctest
The doctests live in the module docstring at the start of the file:
    #!/usr/bin/python
    """
    An example with the typical python ingredients I often use.

    These are the doctests for the module. We first set it up and create
    some comma separated values:

    >>> filename = '/tmp/mydata.csv'
    >>> dm = DataMaker()
    >>> dm.parse_args('--mini 100 --maxi 100 --drift 0 --numbers 3'.split(' '))
    >>> handle = open(filename, 'w')
    >>> dm.setup_handle(handle)
    >>> dm.get_sample()
    >>> handle.flush()
    >>> handle.close()

    We now open the created file and store the contents in a list called lines

    >>> f = open(filename, 'r')
    >>> lines = list()
    >>> for line in f: lines.append(line)

    There are three lines plus a header

    >>> len(lines) == 4
    True

    The first value is 100

    >>> float(lines[1].split(',')[1].strip()) == 100
    True

    The last value is 100

    >>> float(lines[-1].split(',')[-1].strip()) == 100
    True
    """
I'm not sure if it is a good idea, but I use a command line argument to start the tests if that is what the user wants.
        if dm.args.test:
            import doctest
            res = doctest.testmod()
            print("Tested %s cases, %s failed." % (res.attempted, res.failed))
            exit(0)
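To show what comes back in `res`, here is a self-contained sketch. template.py uses `doctest.testmod()`, which tests the whole module. The sketch instead runs the docstring of a single function through `DocTestParser` and `DocTestRunner`, so that it does not depend on which module it is pasted into. The `clamp` function and the `run_doctests` helper are made up for the illustration.

```python
import doctest


def clamp(value, low, high):
    """Keep value inside the interval [low, high].

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


def run_doctests(func):
    """Run the doctests in func's docstring and return the TestResults."""
    parser = doctest.DocTestParser()
    # Arguments: docstring, globals for the examples, name, filename, lineno
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


res = run_doctests(clamp)
print("Tested %s cases, %s failed." % (res.attempted, res.failed))
# prints: Tested 3 cases, 0 failed.
```

One possible refinement, which the original does not do: exit with `res.failed` instead of 0, so that a shell script can notice a failing test run.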
I run it with the Python Code Coverage Module to also measure how effective my doctests are:
    ~/tmp$ coverage run template.py --test
    [...]
    Tested 14 cases, 0 failed.
    ~/tmp$ coverage report -m
    Name       Stmts   Miss  Cover   Missing
    ----------------------------------------
    template      65      5    92%   125-129
The csv file
I haven't used the csv library much, but I'd like to start learning it since I tend to store huge amounts of csv files at work. Until now I have just used a regular file handle and taken care of my semicolons and commas myself. Setting up the csv writer is pretty simple:
    import csv
    #...
            self.writer = csv.writer(filehandle)
            self.write(('timestamp', 'left', 'middle', 'right'))
    #...
            self.writer.writerow([item for item in line])
    #...
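The main thing the csv writer buys you over hand-made commas is the quoting. A minimal sketch, written for Python 3, with an in-memory buffer standing in for the file handle and made-up sample values:

```python
import csv
import io

buffer = io.StringIO()  # stands in for a real file handle
writer = csv.writer(buffer)
writer.writerow(('timestamp', 'left', 'middle', 'right'))
# A comma and a double quote inside the values are quoted by the writer
writer.writerow(['2013-01-01 12:00:00', 100.5, 'a,b', 'say "hi"'])

text = buffer.getvalue()
print(text)

# Reading it back gives the original fields, all as strings
rows = list(csv.reader(io.StringIO(text)))
print(rows[1])
# prints: ['2013-01-01 12:00:00', '100.5', 'a,b', 'say "hi"']
```

Note that `writerow` accepts any iterable, so a tuple or a list can be passed directly.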
The gzip
Using gzip is in fact pretty simple in Python:
    import gzip

    # Python 2: xrange, and a str can be written to a binary handle
    handle = gzip.open('data.gzip', 'wb')
    for d in xrange(8):
        handle.write("data: %s\n" % d)
    handle.flush()
    handle.close()
And I have discovered that zcat, for me, is almost as nice as the gunzip command:
    $ zcat data.gzip
    data: 0
    data: 1
    data: 2
    data: 3
    data: 4
    data: 5
    data: 6
    data: 7
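The snippet above is Python 2. Under Python 3 a `str` cannot be written to a handle opened with `'wb'`, and the csv writer wants a text handle too. A sketch of one way around that, assuming Python 3.3 or later: open the gzip file in text mode (`"wt"`). The temporary directory and the column names are only there for the illustration.

```python
import csv
import gzip
import os
import tempfile

# Hypothetical output location, chosen for the sketch only
path = os.path.join(tempfile.mkdtemp(), "data.csv.gz")

# "wt" opens the compressed file in text mode; newline="" is what the
# csv module documentation recommends for handles given to csv.writer.
with gzip.open(path, "wt", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(("index", "value"))
    for d in range(8):
        writer.writerow((d, "data: %s" % d))

# Read it back, the Python counterpart of zcat
with gzip.open(path, "rt", newline="") as handle:
    rows = list(csv.reader(handle))

print(len(rows), rows[0], rows[-1])
# prints: 9 ['index', 'value'] ['7', 'data: 7']
```

The `with` blocks take care of flushing and closing the handle.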
Running it
Run the script with some arguments:

    $ python template.py --mini 10 --maxi 50 --numbers 20 --drift 1
    DEBUG:Data:Namespace(drift=1.0, maxi=50, mini=10, numbers=20, test=False)
    DEBUG:Data:Ctor OK.
    DEBUG:Data:Setting up handle
    DEBUG:Data:Getting 20 numbers

Uncompress the file:

    $ gunzip -v data.csv.gz
    gzip: data.csv already exists; do you wish to overwrite (y or n)? y
    data.csv.gz:     57.3% -- replaced with data.csv

View and plot with LibreOffice:

    $ libreoffice data.csv
The complete recipe for My Python Pattern
    #!/usr/bin/python
    """
    An example with the typical python ingredients I often use.

    These are the doctests for the module. We first set it up and create
    some comma separated values:

    >>> filename = '/tmp/mydata.csv'
    >>> dm = DataMaker()
    >>> dm.parse_args('--mini 100 --maxi 100 --drift 0 --numbers 3'.split(' '))
    >>> handle = open(filename, 'w')
    >>> dm.setup_handle(handle)
    >>> dm.get_sample()
    >>> handle.flush()
    >>> handle.close()

    We now open the created file and store the contents in a list called lines

    >>> f = open(filename, 'r')
    >>> lines = list()
    >>> for line in f: lines.append(line)

    There are three lines plus a header

    >>> len(lines) == 4
    True

    The first value is 100

    >>> float(lines[1].split(',')[1].strip()) == 100
    True

    The last value is 100

    >>> float(lines[-1].split(',')[-1].strip()) == 100
    True
    """
    import logging
    import csv
    from argparse import ArgumentParser
    import datetime
    import gzip
    from random import uniform


    class DataMaker(object):
        """Class that spits out some data in a csv format."""

        def __init__(self):
            """Ctor sets up logging and parses the command line arguments"""
            self.writer = None
            self.args = None
            self.logger = logging.getLogger(name="Data")
            logging.basicConfig(level=logging.DEBUG)
            self.parse_args()
            self.logger.debug("Ctor OK.")
            return

        def setup_handle(self, filehandle):
            """Setup the file handle"""
            self.logger.debug("Setting up handle")
            self.writer = csv.writer(filehandle)
            self.write(('timestamp', 'left', 'middle', 'right'))
            return

        def parse_args(self, args=None):
            """Parse command line arguments."""
            desc = "Write random numbers to a compressed csv file."
            parser = ArgumentParser(description=desc)
            parser.add_argument('--mini', '-m', type=int, default=100,
                                help="Minimal number")
            parser.add_argument('--maxi', '-M', type=int, default=200,
                                help="Maximal number")
            parser.add_argument('--numbers', '-n', type=int, default=100,
                                help="Number of numbers")
            parser.add_argument('--drift', type=float, default=10.0,
                                help="Drift")
            tgroup = parser.add_argument_group("Testing instead")
            tgroup.add_argument('--test', '-t', action="store_true",
                                default=False,
                                help="Perform doc tests and exit instead.")
            if args:
                self.args = parser.parse_args(args)
            else:
                self.args = parser.parse_args()
            self.logger.debug(self.args)
            return

        def get_sample(self):
            """Get sample based on arguments"""
            self.logger.debug("Getting %s numbers" % self.args.numbers)
            for drift in xrange(self.args.numbers):
                # The interval moves by args.drift for every sample
                mini = self.args.mini + drift*self.args.drift
                maxi = self.args.maxi + drift*self.args.drift
                rands = [uniform(mini, maxi),
                         uniform(mini, maxi),
                         uniform(mini, maxi)]
                rands = sorted(rands)
                self.store(rands[0], rands[1], rands[2])
            return

        def store(self, left, mid, right):
            """Write the values with a timestamp"""
            now = datetime.datetime.now()
            self.write([str(now), left, mid, right])
            return

        def write(self, line):
            """Write a line"""
            # this does not work in python 3
            self.writer.writerow([item for item in line])
            return


    if __name__ == "__main__":
        dm = DataMaker()
        if dm.args.test:
            import doctest
            res = doctest.testmod()
            print("Tested %s cases, %s failed." % (res.attempted, res.failed))
            exit(0)
        handle = gzip.open("data.csv.gz", "wb")
        dm.setup_handle(handle)
        dm.get_sample()
        handle.flush()
        handle.close()
Related entries in Min Blogg:
- Python Pattern Doctest
- Python Doctest And Docstring
- Python Code Coverage Module
- Python Command Line Arguments, where I use another argument parser (the option parser in optparse).
- Python Compressed Files, where I investigate how efficient the compression gets depending on the patterns in the file.
See also the Standard Python Library documentation:
- argparse - Parser for command-line options, arguments and sub-commands [1]
- csv - CSV File Reading and Writing [2]
- doctest - Test interactive Python examples [3]
- gzip - Support for gzip files [4]
- logging - Logging facility for Python (Make sure you read the three tutorials if you want more than plain vanilla logging - it rapidly gets complicated) [5]
See also Doug Hellmann's Python Module of the Week: