Monday 17 August 2020

(practical-python->racket 1.7 functions)

I was expecting some difficulty in "translating" even the introductory section on functions from Python to Racket. I have only written basic functions with basic signatures in Racket so far. I have used functions which accept a variable number of parameters and those which have named parameters, but I haven't written any yet.

I was relieved when I went through the introduction to Functions in Practical Python Programming to find that it only introduced basic functions with a single parameter. It did also include error trapping and handling command line arguments, so I did have some learning to do.

As calling functions recursively is the main looping paradigm in Racket, I've already written a number of functions in translating previous examples and exercises, mainly to replicate Python's for loops.

The section starts by introducing custom functions (i.e. functions defined in the program). Python Docstrings and how they would be displayed in the console by using the builtin help() function are already introduced at this early stage. As far as I know, Racket doesn't include such builtin function documentation out of the box.

The simple example of a Python function is:

def sumcount(n):
    '''
    Returns the sum of the first n integers
    '''
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total

In my translation, I used an "inner" function to perform the loop:

; Calculate the sum of the first n integers
(define (sumcount n)
  (define (sc n total)
    (cond
      [(= n 0) total]
      [else (sc (- n 1) (+ total n))]))
  (sc n 0))

Whilst the Racket is the same number of lines of code as the Python, I do find it more difficult to take it in at a glance.
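
For comparison, here is a rough sketch of how the Racket version above would read if it were translated back into Python. It is not from Practical Python Programming; the inner function name sc simply mirrors my Racket code. The inner function carries the running total as a second parameter instead of updating a variable in a loop.

```python
def sumcount(n):
    """Return the sum of the first n integers using an accumulator."""

    def sc(n, total):
        # Base case: nothing left to add, so the accumulator is the answer.
        if n == 0:
            return total
        # Recursive case: mirrors (sc (- n 1) (+ total n)) in the Racket code.
        return sc(n - 1, total + n)

    return sc(n, 0)


print(sumcount(10))  # 55
```

One caveat: Python does not optimise tail calls, so this style hits the recursion limit for large n, whereas Racket handles it without growing the stack.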

This is followed by two examples of Python library functions. From my point of view, both Python and Racket have three levels of function libraries: built-in functions, standard library functions and external library functions. The builtin and standard library functions are distributed as part of the language; external libraries have to be separately downloaded and installed. The distinction in my mental model between builtin functions and library functions is that builtin functions can be used without the need to tell the language that you want to use them. For example, you don't need to import the Python print function.

(Actually, Racket consists of many different languages which will have different standard libraries and even different syntax. I am sticking to the core #lang racket language in this project.) 

The first Python library function mentioned is sqrt which is part of the Python math library:

import math
x = math.sqrt(10)

The Racket sqrt function is "builtin" and doesn't need to be required before it is used.

(sqrt x)

The second Python library function is urlopen from the standard urllib.request module:

import urllib.request
u = urllib.request.urlopen('http://www.python.org/')
data = u.read()

Racket also requires access to a standard library for HTTP support. The Racket standard library has a huge range of options for working with HTTP. (Probably even more than Python.) This is the neatest solution that I found:

(require net/url)
(call/input-url (string->url "https://racket-lang.org")
                get-pure-port
                port->string)

The introduction of errors in Practical Python Programming is based on attempting to convert a non-numeric string to an integer. Doing so causes Python to raise an exception. Racket's equivalent, string->number, simply returns false (#f) in such cases. To emulate the Python example, I instead passed it the symbol 'N/A, which is not a string and so triggers a contract violation.

Python console:

>>> int('N/A')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'N/A'

Racket Console:

> (string->number 'N/A)
; string->number: contract violation
;   expected: string?
;   given: 'N/A
; [,bt for context]
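
To make the difference in behaviour concrete, here is a small Python sketch that imitates string->number by returning None for a non-numeric string instead of raising. The helper name string_to_number is hypothetical; it is not part of Python or of the course material.

```python
def string_to_number(text):
    """Hypothetical helper: return an int, a float, or None if text isn't numeric."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            # Not parseable by this converter; try the next one.
            continue
    return None


print(string_to_number("100"))  # 100
print(string_to_number("N/A"))  # None
```

With this style the caller has to check the result, just as Racket code has to check for #f, rather than relying on an exception to interrupt the calculation.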

Handling errors in Python and Racket seemed very similar to me. The main difference is the order of the code. In Python, error handling follows the "protected" code. In Racket, error handling precedes the "protected" code. I do find the Python code easier to read at a glance though. (However, from what I learnt from How To Design Programs, the readability of Racket programs can be greatly improved by wrapping code in well-named functions.)

Python:

    try:
        shares = int(fields[1])
    except ValueError:
        print("Couldn't parse", line)

Racket:

    (with-handlers ([exn:fail:contract?
                     (lambda (err)
                       (display "Couldn't Parse ")
                       (displayln field))])
      (string->number field))

Raising errors is easy in both languages, though in Python you must always specify the type of error (by providing an instance of the specific error). In Racket, you can either raise an unclassified error or a specific type of error. The latter requires calling a specific function such as raise-range-error.

Python:

    raise RuntimeError('What a kerfuffle')
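
As a slightly fuller illustration of raising a specific error type and catching it, here is a minimal sketch. The function check_shares and its rule about negative counts are made up for the example.

```python
def check_shares(count):
    """Hypothetical validator: reject negative share counts."""
    if count < 0:
        # The exception type (ValueError here) is chosen by the code that raises.
        raise ValueError(f"share count must not be negative: {count}")
    return count


try:
    check_shares(-5)
except ValueError as err:
    print("Caught:", err)
```

The except clause names the error type it is prepared to handle, much as a with-handlers clause names a predicate such as exn:fail:contract?.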

The exercises start with revising how to define a simple function and calling it from the Python console. Here's my Racket version.

> (define (greeting name)
    (displayln (string-append "Hello " name)))
> (greeting "Guido")
Hello Guido
> (greeting "Paula")
Hello Paula

The remaining exercises involve converting the portfolio costing program from the files exercise into a function, adding error handling, using a library function to process the csv file and finally turning it into a command line program.

The first step is to take the code written during the exercises for 1.6 Files, which totalled the cost of a stock portfolio stored in a csv file, and turn it into a function. The original Python code used a for ... in loop which I translated into a Racket recursive function. So I wrapped the recursive function inside another function to complete the exercise:

; portfolio cost - 1st iteration - takes a filename parameter
(define (portfolio-cost filename)
  ; pcost - an internal function to perform the "loop"
  (define (pcost rows)
    (cond
      [(empty? rows) 0]
      [else (+ (* (string->number
                    (second (string-split (first rows) ",")))
                  (string->number
                    (third (string-split (first rows) ","))))
               (pcost (rest rows)))]))
  (define csv-rows (file->lines filename #:mode 'text))
  (pcost (rest csv-rows)))
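
For reference, this is a rough sketch of the kind of Python for ... in loop being translated. It is not the course's own code, and it assumes a header line followed by rows of name, shares, price, which matches the second and third fields that the Racket code reads.

```python
def portfolio_cost(filename):
    """Sum shares * price over every data row of a comma-separated file."""
    total = 0.0
    with open(filename, "rt") as f:
        next(f)  # skip the header line, like (rest csv-rows) in the Racket code
        for line in f:
            fields = line.strip().split(",")
            # Assumed layout: fields[1] is the share count, fields[2] the price.
            total += int(fields[1]) * float(fields[2])
    return total
```

The loop variable total plays the same role as the value built up by the recursive calls to pcost.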

The python -i command line option is used to show how to load a file (a module in Python) and access Python interactively. I didn't find this so straightforward in Racket. I'm sure it is possible but it seems to require knowledge of Racket modules which I haven't looked into yet.

The next exercise is to add error handling to the function to ignore any lines where the cost couldn't be calculated. I wrapped the cost calculation in a call to with-handlers. It looks quite messy and I'd guess that the Racket Way would have been to isolate the calculation into a function to keep the main code cleaner:

; portfolio cost - 2nd iteration - add error handling
(define (portfolio-cost filename)
  ; pcost - an internal function to perform the "loop"
  (define (pcost rows)
    (cond
      [(empty? rows) 0]
      [else (+ (with-handlers ([exn:fail:contract?
                               (lambda (err)
                                 (display "Couldn't Parse ")
                                 (displayln (first rows))
                                 0)])
                 (* (string->number
                      (second (string-split (first rows) ",")))
                    (string->number
                      (third (string-split (first rows) ",")))))
               (pcost (rest rows)))]))
  (define csv-rows (file->lines filename #:mode 'text))
  (pcost (rest csv-rows)))

The next exercise was to use a library function to parse the csv data rather than doing it "by hand". There is a csv module in Python's standard library. There is a csv parsing module available for Racket. It is not included in the standard library but is "approved" in the sense that it is included in the official Racket documentation. It only took a couple of minutes to install using DrRacket. It also installed a couple of other libraries on which it depends.

I used the csv library only to convert the csv to a list of lists. It took away the need to split each row of the csv data. (It also would have taken care of things like embedded quotes that string-split wouldn't have handled.)

; portfolio cost - 3rd iteration - use a csv library
(require csv-reading)
(define (portfolio-cost filename)
  ; pcost - an internal function to perform the "loop"
  (define (pcost rows)
    (cond
      [(empty? rows) 0]
      [else (+ (with-handlers ([exn:fail:contract?
                               (lambda (err)
                                 (display "Couldn't Parse ")
                                 (displayln (first rows))
                                 0)])
                 (* (string->number (second (first rows)))
                    (string->number (third (first rows)))))
               (pcost (rest rows)))]))
  (define csv-rows (call-with-input-file filename csv->list))
  (pcost (rest csv-rows)))
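
The Python side of this step, using the standard csv module together with the error handling from the previous exercise, might look something like the sketch below. It is an approximation rather than the course's solution, and it again assumes rows of name, shares, price after a header row.

```python
import csv


def portfolio_cost(filename):
    """Total shares * price, skipping rows that cannot be parsed."""
    total = 0.0
    with open(filename, "rt", newline="") as f:
        rows = csv.reader(f)
        next(rows)  # skip the header row
        for row in rows:
            try:
                total += int(row[1]) * float(row[2])
            except (ValueError, IndexError):
                # Report the bad row and carry on, as the Racket handler does.
                print("Couldn't parse", row)
    return total
```

As in the Racket version, the library hands back each row already split into fields, so the only parsing left is converting the strings to numbers.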

The last exercise was to write a program that called the function with a filename passed as a command line argument. After a little searching, I found that (vector-ref (current-command-line-arguments) 0) will provide the first command line argument. Here's the final program:

(require csv-reading)

; get the filename from the command line
(define filename (vector-ref (current-command-line-arguments) 0))

; define the function to calculate the cost of the portfolio
(define (portfolio-cost filename)
  ; pcost - an internal function to perform the "loop"
  (define (pcost rows)
    (cond
      [(empty? rows) 0]
      [else (+ (with-handlers ([exn:fail:contract?
                               (lambda (err)
                                 (display "Couldn't Parse ")
                                 (displayln (first rows))
                                 0)])
                 (* (string->number (second (first rows)))
                    (string->number (third (first rows)))))
               (pcost (rest rows)))]))
  (define csv-rows (call-with-input-file filename csv->list))
  (pcost (rest csv-rows)))

; calculate the cost of the portfolio in the file provided
(define total-cost (portfolio-cost filename))
(display "Total cost: ")
(displayln total-cost)
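
One difference worth noting when comparing with Python is where the first argument lives. The sketch below shows the Python side; the helper name filename_from_args is hypothetical, and it returns None when no argument was given rather than failing the way an out-of-range vector-ref would.

```python
import sys


def filename_from_args(argv):
    """Return the first command line argument, or None if it is missing."""
    # argv[0] is the script name in Python, so the first real argument is argv[1].
    # Racket's (current-command-line-arguments) vector omits the program name,
    # which is why index 0 is used there.
    if len(argv) < 2:
        return None
    return argv[1]
```

In a script this would be called as filename_from_args(sys.argv), with the result checked before it is passed on to the costing function.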


The final program is portfolio-cost.csv, the translation of the examples is functions.rkt.

That concludes the Introduction. Next comes Working With Data which covers many of Python's more complex datatypes. Its first section covers an introduction to data structures.


Monday 3 August 2020

(practical-python->racket 1.6 Files)

Section 1.6 Files of Practical Python Programming follows a similar pattern to the previous two sections. It does have one exercise that calls for a complete program but it is such a short one that I included all the examples and exercises in the single file - files.rkt

The Python examples and exercises are designed to be run from the Python console started from a specified directory. I cannot be certain from where files.rkt will be run, so in the script I needed to make sure that the file paths would be correct whatever directory the script was started from. This is a perennial banana skin for me in most languages. I found a possible solution by referring to Stack Overflow. It wasn't elegant but it did work. Thankfully I checked on the Racket Mailing List and was kindly pointed to define-runtime-path which, happily for me, takes care of the problem for you.

You can achieve the same in Python but, as far as I know, not quite so easily.

Racket:
(require racket/runtime-path)
(define-runtime-path f "my-file.txt")

Python:
import os
f = os.path.join(os.path.dirname(__file__), 'my-file.txt')
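
Python's pathlib module offers a somewhat tidier spelling of the same idea. This is only a sketch; the helper name path_beside is hypothetical, and it takes the script's path as a parameter so that it can be tried out from the console as well as from a file.

```python
from pathlib import Path


def path_beside(script_file, name):
    """Hypothetical helper: resolve name relative to the directory holding script_file."""
    # resolve() makes the path absolute, so the result does not depend on
    # the directory the program was started from.
    return Path(script_file).resolve().parent / name
```

Inside a script it would be called as path_beside(__file__, "my-file.txt").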

The examples start with opening text files and reading and writing their contents into and out of strings. Translating them was relatively straightforward. The only thing that I note is that in Racket you open a file but then refer to it as a port. (Again this may be a lack of understanding on my part.) This might be a little confusing to a newcomer but, having been introduced to ports in Rebol, it didn't have that effect on me.

(define f (open-input-file "foo.txt" #:mode 'text))
(port->string f)
(close-input-port f)

This is maybe a little academic as Racket's file->string function reads the contents of a file into a string, automatically opening and closing the file. (The same is true when writing a string to a file with Racket's display-to-file function; write-to-file also opens and closes the file for you, but it writes the string in its quoted form.)

The Python examples for processing data use Python's context manager feature which automatically closes a file once it has been processed. Racket's file->string and file->lines do the same more concisely as there is no need to explicitly open the file as there is with a Python context manager.

The Python next function is used in Exercise 1.26 File Preliminaries. As the example only calls next() once, I used Racket's first and rest on a list of lines read from the file to simulate it.

I also came up with a solution that was closer to the Python example:
  (define f (open-input-file "Data/portfolio.csv" #:mode 'text))
  (define headers (read-line f))   ; "remove" the headers from the port
  (define rows (port->lines f))    ; read the remainder of the file
  (close-input-port f)
  (displayln headers)
  (for-each (lambda (r) (displayln r)) rows)
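
For comparison, the Python pattern being imitated looks roughly like the sketch below. To keep it runnable anywhere it reads from an io.StringIO object standing in for the open file, and the sample data is made up.

```python
import io

# Stand-in for an open file so the sketch runs anywhere; the data is made up.
f = io.StringIO("name,shares,price\nAA,100,32.20\nIBM,50,91.10\n")

# next() consumes the first line, like (read-line f) in the Racket code.
# Python keeps the trailing newline, so it is stripped here.
headers = next(f).rstrip("\n")

# Whatever is left on the "port" is the data, like (port->lines f).
rows = [line.rstrip("\n") for line in f]

print(headers)
for row in rows:
    print(row)
```

In both languages the file object remembers how far it has been read, which is what lets the header be peeled off before the remaining lines are collected.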

The one example that called for a separate program was to calculate the cost of a share portfolio where the number of shares and their individual costs were stored in a csv file. Here is my solution, a pretty straightforward recursive function:

  (define csv-rows (file->lines "Data/portfolio.csv" #:mode 'text))
  (define (pcost rows)
    (cond
      [(empty? rows) 0]
      [else (define row (string-split (first rows) ","))
            (+ (* (string->number (second row))
                  (string->number (third row)))
               (pcost (rest rows)))]))
  (display "Total cost ")
  (displayln (pcost (rest csv-rows)))

The final exercise was to open a gzipped version of a csv file. Both Python and Racket require a gzip library to be imported/required from their standard libraries, though neither requires an additional library to be installed. The only item of note is that Racket's gunzip-through-ports function requires an output port to be passed as a parameter, into which it decompresses the data; I used an output string port for this. At first glance, Racket's output string ports seem similar to Python's StringIO objects. I fully expect to have to find out more as I progress with translating Practical Python Programming. Here is my translation of the Python example:

  (require file/gunzip)
  (define gzip (open-input-file "Data/portfolio.csv.gz"))
  (define csv (open-output-string))
  (gunzip-through-ports gzip csv)
  (displayln (get-output-string csv))
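
The Python side of this is short because gzip.open can decode the data as text while reading. The sketch below is self-contained rather than a copy of the course example: it first writes a small gzipped file to a temporary directory, and both the file name and the sample data are made up.

```python
import gzip
import os
import tempfile

# Build a tiny gzipped file first so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "portfolio.csv.gz")
with gzip.open(path, "wt") as out:
    out.write("name,shares,price\nAA,100,32.20\n")

# "rt" asks for decoded text rather than raw bytes.
with gzip.open(path, "rt") as f:
    text = f.read()

print(text)
```

Here the decompressed text comes straight back from read(), whereas in the Racket version it is collected in the output string port and fetched with get-output-string.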

So that wraps up the Introduction to Files. Next up is 1.7 Functions which I fully expect to require some study on my part before I'll be able to come up with a translation.