Computer Science I for majors by James Tam
Due Nov 7 at 4 PM
Python is a fairly unusual language in that it uses indentation as part of its syntax. But tab and space characters can complicate things. A single tab may look identical to a sequence of spaces when the program is printed or displayed on the screen, but the Python interpreter may see the two as very different levels of indentation. This issue can lead to bugs that are very difficult to find. What we need are tools that 1) allow us to actually see the tabs and spaces as printable characters, and 2) detect and intelligently convert tabs to spaces or spaces to tabs. The second problem is compounded because a tab can be "equivalent to" any number of spaces (typically 2, 3, 4, or 8).
Obviously, these 4 functions (tabs to spaces, spaces to tabs, making whitespace visible, and undoing that) will not always be used at the same time. Therefore, we will use command-line qualifiers to allow the user to specify the subset of functionalities he/she wants. We do this in UNIX-ese, where a minus sign (and, in this program, also a plus sign) introduces a short qualifier. If our program is called "tabs" and [] marks an optional qualifier, the command line is:
tabs [+t] [-t] [-T<integer>] [+v] [-v] [-help]
+t | replaces prefix sequences of spaces of length T with a single tab
-t | replaces prefix tabs with sequences of T spaces
-T<integer> | the <integer> defines the space-to-tab ratio, T (default=4)
+v | changes all spaces, tabs, and newlines to printable (visible) characters
-v | undoes the effects of +v
-help | prints out help text
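As a rough illustration of what +t and -t do to the indentation of a single line, here is a minimal sketch. The function names (split_prefix, tabs_to_spaces, spaces_to_tabs) are made up for this illustration, and the rule it assumes (measure the prefix at T columns per tab, then rebuild it) is only one reading of the table above, not a required design.

```python
def split_prefix(line):
    """Split a line into its leading run of spaces/tabs and the rest."""
    rest = line.lstrip(" \t")
    return line[:len(line) - len(rest)], rest


def tabs_to_spaces(line, t=4):
    """-t (assumed behaviour): expand each prefix tab to the next multiple of t columns."""
    prefix, rest = split_prefix(line)
    return prefix.expandtabs(t) + rest


def spaces_to_tabs(line, t=4):
    """+t (assumed behaviour): rebuild the prefix from whole tabs, then leftover spaces."""
    prefix, rest = split_prefix(line)
    width = len(prefix.expandtabs(t))  # prefix width measured in columns
    return "\t" * (width // t) + " " * (width % t) + rest
```

Under these assumptions, a 6-space indent with the default T of 4 becomes one tab followed by 2 spaces, and the same indent with T set to 8 is left as 6 spaces. Only the prefix is touched; tabs and spaces later in the line are left alone.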
The program will take its input from standard input (i.e., the user), and output its results to standard output (i.e., the screen). That will allow us to type in Python commands from the console and get the results back immediately after we finish typing a line. The program continues to read input until it encounters EOF (End-Of-File), which is a ctrl-D character on UNIX, Linux, and Mac (and, I believe, a ctrl-Z-return sequence in Windows).
But this kind of program is most useful when it can read a file and place its output in a different file. So typically, this program will be run with redirected ("<") input from a file, and possibly redirected (">") output to a file. These two command-line operators temporarily redirect input and output, so that the input appears to have been typed in from a file and the output goes to a file. For example:
$ python3 tabs.py +v +t -T4 < A3.py > temp.out # 4 spaces becomes 1 tab (results in temp.out)
will change prefix space sequences of length 4 to tabs in the input file (A3.py), and save the result in temp.out. Then, if we do:
$ python3 tabs.py -v -t -T4 < temp.out > temp2.out # 1 tab becomes 4 spaces (results in temp2.out)
we should find (assuming the original file was properly indented by 4s with spaces only) that:
$ diff A3.py temp2.out
finds no differences (diff is a UNIX utility that can be used to determine if there are any 'differences' between files).
Synopsis:
tabs [+t] [-t] [-T<integer>] [+v] [-v] [-help]
+t -replaces prefix sequences of spaces of length T with a single tab
-t -replaces prefix tabs with sequences of T spaces
-T<integer> -the <integer> defines the space-to-tab ratio, T (default=4)
+v -changes all spaces, tabs, and newlines to printable (visible) characters
-v -undoes the effects of +v
-help -prints out this help text
+t and -t are incompatible
+v and -v are incompatible
In the following examples, "%" is used for the operating system (O/S) prompt, and colour is used as follows:
% python3 tabs.py < A1.py # Redirect standard input from file A1.py.
# Your name: Rob Kremer # The output is an exact copy of A1.py
# Student ID: 00999888
# Tutorial #: 05
'''
Created on Aug 25, 2014

@author: kremer
'''
...
print()
print("Weighted mini assignment grade "+"%1.2f"%miniAssn)
print("Weighted assignment grade "+"%1.2f"%assn)
print("Weighted midterm grade "+"%1.2f"%midterm)
print("Weighted final exam grade "+"%1.2f"%final)
print("Weighted term grade "+"%1.2f"%(miniAssn+assn+midterm+final))
% # note that the program terminates on its own after reading the input file
% python3 tabs.py +v < A1.py # redirect standard input from file A1.py
#·Your·name:·Rob·Kremer¶ # The output is a copy of A1.py with prefix spaces changed to "·", etc.
#·Student·ID:·00999888¶
#·Tutorial·#:·05¶
'''¶
Created·on·Aug·25,·2014¶
¶
@author:·kremer¶
'''¶
¶
...
print()¶
print("Weighted·mini·assignment·grade·"+"%1.2f"%miniAssn)·¶
print("Weighted·assignment·grade·"+"%1.2f"%assn)·¶
print("Weighted·midterm·grade·"+"%1.2f"%midterm)·¶
print("Weighted·final·exam·grade·"+"%1.2f"%final)·¶
print("Weighted·term·grade·"+"%1.2f"%(miniAssn+assn+midterm+final))¶
% # note that the program terminates on its own after reading the input file
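The +v output above, and its reversal with -v, can be approximated with plain string replacement. This is only a sketch under an assumption the handout does not spell out: the real tab or newline is kept right after its visible marker, so the layout survives and -v can undo the change. The function names make_visible and make_invisible are hypothetical.

```python
SPACE_CHAR = chr(183)    # raised dot, stands in for a space
TAB_CHAR = chr(187)      # ">>" as one character, marks a tab
NEWLINE_CHAR = chr(182)  # backwards P, marks a newline


def make_visible(text):
    """+v sketch: mark spaces, tabs, and newlines with printable characters."""
    text = text.replace(" ", SPACE_CHAR)
    text = text.replace("\t", TAB_CHAR + "\t")        # assumption: keep the real tab
    return text.replace("\n", NEWLINE_CHAR + "\n")    # assumption: keep the real newline


def make_invisible(text):
    """-v sketch: reverse make_visible()."""
    text = text.replace(TAB_CHAR + "\t", "\t")
    text = text.replace(NEWLINE_CHAR + "\n", "\n")
    return text.replace(SPACE_CHAR, " ")
```

A round trip through both functions returns the original text as long as that text does not already contain one of the three marker characters.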
Note that in these examples, my terminal has an 8-space tab (usually the default for terminal programs), whereas the tabs program has a default 4-space tab.
% python3 tabs.py +t
        pass # input: 8-space indent (which is 2 default 4-space tabs)
                pass # output: 2 tabs (8 spaces each on the terminal)
      pass # input: 6-space indent
          pass # output: a tab and 2 spaces
        pass # input: 2 spaces and a tab
        pass # output: 1 tab
<ctrl-D>
% python3 tabs.py +t -T8 # same user input as above
        pass
        pass # output: 1 tab
      pass
      pass # output: 6 spaces
        pass
        pass # output: 1 tab
<ctrl-D>
% python3 tabs.py +t +v # these 2 runs are exactly the same as the above 2, but with +v to show what's going on
        pass
» » pass¶
      pass
» ··pass¶
        pass
» pass¶
<ctrl-D>
% python3 tabs.py +t +v -T8
        pass
» pass¶
      pass
······pass¶
        pass
» pass¶
<ctrl-D>
% python3 tabs.py -t +v
                pass # input: 2 tabs (8 spaces each on the terminal)
········pass¶
          pass # input: a tab and 2 spaces
······pass¶
        pass # input: 2 spaces and a tab
····pass¶
<ctrl-D>
%
% python3 tabs.py -HeLp
This program can process python program files in the following ways:
1. Change tabs in the indenting to spaces
2. Change spaces in the indenting to tabs
3. Substitute spaces, tabs, and newlines for printable characters, maintaining formatting
4. Undo 3.
See the synopsis (below) for details on the command line interface.
Typically, this program will be run with redirected "<" input from a file, and possibly redirected ">" output to a file. For example:
$ python3 A3.py +v +t -T4 < A3.py > temp.out
will change prefix space sequences of length 4 to tabs in the text of the input file ("A3.py"), and save it in "temp.out". Then, if we do:
$ python3 A3.py -v -t -T4 < temp.out > temp2.out
we should find (assuming the original file was properly indented by 4s with spaces only) that:
$ diff A3.py temp2.out
finds no differences (diff is a UNIX program).
Synopsis:
tabs [+t] [-t] [-T<integer>] [+v] [-v] [-help]
+t -replaces prefix sequences of spaces of length T with a single tab
-t -replaces prefix tabs with sequences of T spaces
-T<integer> -the <integer> defines the space-to-tab ratio, T (default=4)
+v -changes all spaces, tabs, and newlines to printable (visible) characters
-v -undoes the effects of +v
-help -prints out this help text
+t and -t are incompatible
+v and -v are incompatible
% # the program terminates because only the help text was requested
% python3 A3.py -t +t +v -v -V
Unrecognized argument: -V
Qualifiers +v and -v cannot both be used together.
Qualifiers +t and -t cannot both be used together.
Synopsis:
tabs [+t] [-t] [-T<integer>] [+v] [-v] [-help]
+t -replaces prefix sequences of spaces of length T with a single tab
-t -replaces prefix tabs with sequences of T spaces
-T<integer> -the <integer> defines the space-to-tab ratio, T (default=4)
+v -changes all spaces, tabs, and newlines to printable (visible) characters
-v -undoes the effects of +v
-help -prints out this help text
+t and -t are incompatible
+v and -v are incompatible
% # the program terminated because there were command-line errors
This section is meant to help you with your program. You CAN cut-and-paste the code given here into your code without citing it and without fear of penalty. Do not cut-and-paste any other code without citing it, though!
Since this is a fairly complex program, your TA may use a testing harness to verify that your program runs correctly. Therefore you must make it possible for another program to include and run your program, so all the code that normally runs when you invoke your program from the command line should be embedded within a main program function. In addition, you want to leave calling the main program to any test harness, but still call it yourself if the file is invoked directly from the command line. To do that, use the following paradigm:
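# imports go here
# global constants go here
# all your function definitions go here
def main(): # Or start()
    # Body of main() function goes here
    pass
if __name__ == "__main__": # true only when this file is run directly, not when it is imported
    main()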
You will need to gather the user-specified qualifiers from the command line. To do that you need to use sys.argv from the system library. To import from the system library use the following import statement:
import sys # needed to collect command-line arguments (sys.argv)
sys.argv is a sequence containing the command line arguments. The first element of sys.argv is the program name, and the remainder are the "words" (arguments) the user typed after the program name. These do NOT include expressions (such as redirections ["<", ">", and ">>"] and pipes ["|"]) interpreted by the command-line processor. Thus, the paradigm for reading command lines is as follows:
firstArg = True
for arg in sys.argv:
    if firstArg: # the first argument is always the program name, so ignore it
        firstArg = False
    elif (arg=="-t"):
        ...
    # ... (more elif branches, one per qualifier) ...
    else: # if we got here, then we didn't recognize the argument
        ...
Examples of working with command line arguments can be found in the "tutorials" directory under the subdirectory "oct19_25/assign_like".
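To make the paradigm more concrete, here is one possible sketch of a qualifier parser. The function name parse_qualifiers and the dictionary layout are invented for this illustration. It skips the program name by slicing instead of using a firstArg flag, and it assumes that -help is matched without regard to case (as the -HeLp example suggests) while -t and -T stay case-sensitive. The error messages are copied from the sample run above.

```python
def parse_qualifiers(argv):
    """Return (options, errors) for an argv-style list; argv[0] is skipped."""
    options = {"+t": False, "-t": False, "+v": False, "-v": False,
               "help": False, "T": 4}
    errors = []
    for arg in argv[1:]:
        if arg in ("+t", "-t", "+v", "-v"):
            options[arg] = True
        elif arg.lower() == "-help":       # assumption: any letter case is accepted
            options["help"] = True
        elif arg.startswith("-T") and arg[2:].isdecimal() and int(arg[2:]) > 0:
            options["T"] = int(arg[2:])    # e.g. "-T8" sets T to 8
        else:
            errors.append("Unrecognized argument: " + arg)
    if options["+v"] and options["-v"]:
        errors.append("Qualifiers +v and -v cannot both be used together.")
    if options["+t"] and options["-t"]:
        errors.append("Qualifiers +t and -t cannot both be used together.")
    return options, errors
```

A main() function could call parse_qualifiers(sys.argv), print the errors and the synopsis if the error list is not empty, and otherwise go on to process the input.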
The standard Python input() function does not handle EOF (End-Of-File) (the ctrl-D character) gracefully: If you type a ctrl-D in response to an input(), it will throw an exception (i.e., a run-time error and 'crash') and your program will terminate. The problem is easily solved, but we have not covered that yet. So use the following function in lieu of the standard input() function:
def getInput():
    """This function works exactly like input() (with no arguments), except that, instead of throwing an exception when it encounters EOF (End-Of-File), it will return an EOF character (chr(4)).

    Returns: a line of input or EOF if EOFError occurs during input.
    """
    try:
        ret = input()
    except EOFError:
        ret = EOF
    return ret
You might not understand this code since it includes concepts that we haven't covered yet (such as exceptions). That's okay for now; just make sure you understand how to use it. Also note that, unlike Python's input(), it does not take any parameters. That's OK: because this program might be taking its input from a file, we should not prompt for each line of input.
An example of using the getInput() function can be found under the 'tutorials' directory under the subdirectory "oct19_25/assign_like" and the program name is display_sentences.py.
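For orientation, a read-until-EOF loop built on getInput() might look like the sketch below. The helper name read_lines is hypothetical, and collecting the lines in a list is just for illustration; a real program would more likely process and print each line as soon as it is read. getInput() and EOF are repeated here only so the sketch runs on its own.

```python
EOF = chr(4)  # sentinel that getInput() returns at End-Of-File


def getInput():
    """Like input() with no arguments, but returns EOF instead of raising EOFError."""
    try:
        ret = input()
    except EOFError:
        ret = EOF
    return ret


def read_lines():
    """Collect every input line (without its trailing newline) until EOF."""
    lines = []
    line = getInput()
    while line != EOF:
        lines.append(line)
        line = getInput()
    return lines
```

Note that input() strips the newline from each line, so a program that wants to show newlines with +v has to add the visible newline character back itself.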
EOF = chr(4) # A standard End-Of-File character (ascii value 4)
TAB_CHAR = chr(187) # A ">>" character (as a single character) in the extended ascii set
# Used to make a tab character visible.
SPACE_CHAR = chr(183) # A raised dot character in the extended ascii set
# Used to make a space character visible
NEWLINE_CHAR = chr(182) # A backwards P character in the extended ascii set
# Used to make a newline character visible
For this assignment, you can use any of the Python built-in functions. You may also use any of the built-in string methods; you might find <str>.replace(), <str>.lstrip(), <str>.expandtabs(), <str>.upper(), and the 'slicing' [<start> : <end>] operator particularly useful.
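As a quick reminder of what these string tools return, here is a small sketch on a made-up line whose indentation is 2 spaces followed by a tab. The variable names are arbitrary; the values in the comments follow from how each method is defined.

```python
line = "  \tif x:\n"                     # 2 spaces, a tab, then the code

body = line.lstrip(" \t")                # "if x:\n" (leading spaces/tabs removed)
prefix = line[:len(line) - len(body)]    # "  \t" (slicing keeps only the indentation)
expanded = prefix.expandtabs(4)          # "    " (tab stops every 4 columns)
dotted = expanded.replace(" ", ".")      # "...." (every space replaced)
shouted = "-HeLp".upper()                # "-HELP" (handy for case-insensitive checks)
```

All of these methods return new strings; the original line is never modified.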