A4: VBA Programming For Excel
Due dates for all assignments can be found on the [graded
components]
section of the course webpage.
This assignment allows you to
work on a real world problem (analyzing and visualizing actual Covid infection
data in Alberta.
Back ground information (by James Tam).
JT: assignment requirements aren't directly included in this section and the
reader looking only for that information can skip directly to the text under the
next heading "Starting
files that you should modify and submit". The background information is
provided for those who are interested in learning what motivated me to write
a program to track infection rates which eventually became the genesis for an
assignment in CPSC 203.
Sometime after infections began to occur in Canada there was constant message
about the need to "flatten the curve" (a decrease in the slope of a function
that plots infection rates) and things were listed that could be done to attain that goal.
To keep the public informed about the progress of the infections, agencies such
as Alberta Health Services (AHS) would release information that included:
infection rates, age demographics, gender as well information about the people
who had unfortunately succumb to the illness. Figure 1 graphs the information
about the progress of infection rates and fatalities based on the AHS
information. This graph was created from actual AHS data and the instances of
new infections and fatalities were counted and graphed automatically using an
Excel VBA program.
Figure 1: Graphing new infections
(line higher up in blue) vs. the number of infected people who have passed away
(the lower line in a short of orange-red color).
In the early days of the
infections in the spring of 2020 when the number of new infections was lower, the number
of new infections vs. those who died could be done in one graph. The difference
between the numbers wasn't so large that trends could still be seen over time.
With the increase in infection numbers in the fall it may be easier to determine
trends by two separate graphs (Figure 2).
|
Figure 2A: Number of new
infections |
|
|
Figure 2B: Number of people who
passed away |
Of course it is easier to see
change in the infection rate if only new infections are charted. While it may be
possible to see the decreasing curve when charting total infections at a
particular point in time it's harder to see trends until they become extreme.
While I was happy to view other people's pre-created visualizations I found that
information available last spring lacking. Like everyone else I wanted to see a
sign of hope that things would get better (trying to keep "hope
in the heart" as it were) and because only total infection rates were
provided by AHS (it was for instance provided by the Quebec equivalent of
AHS) I ended up producing this information myself.
However,
I found that AHS was quite open and transparent in the actual data regarding
infection rates and it can be found on the publically accessible website:
https://www.alberta.ca/stats/covid-19-alberta-statistics.htm. Click under
"Data Export" and the page will display actual information about
infections in Alberta in tabular format. Each row is actual information about a
person that was infected with Covid in the province. (Since the information was
available in a public website I assumed that it met privacy requirements.
I did however get verbal permission of a supervisor
from AHS before designing this assignment. As of November 27,
2020 there were 56,444 people who were infected. Scroll to the bottom of the
page click on the "Excel" button to download an actual spreadsheet with this
information. Keep in mind that the data is set is large so at this point the
creation of the spreadsheet may take a fair amount of time. Don't close or
interrupt the browser until the process is complete.
With the raw data in hand I at first I had no desire to write a program to count
the infection numbers. I'm of the attitude why write a program if that
capability is already provided in the Excel functions. While graphing functions
abound in Excel, the ability to tally infection rates would be awkward (new
instances of the COUNTIF and
COUNTIFS functions would be needed for each day and the range would have
to be constantly updated as AHS added more rows of data to the information they
released. Thus had to I write a program that would help me visualization the
information.
Now of course many media outlets
(if you're interested I've found the
CBC to be
one of the best in this regard) have hired visualization specialists to not only graph the progress of the
disease over time but some have even created animated changes over time or
interactive displays. So there is very little need for an individual person to
plot the information on their own. However, I still decided to use the gist of
the problem as an assignment: the plotting of Alberta Covid data. It seemed
appropriate - since the exam components were dropped because it was a distance
learning semester required by the Covid pandemic and a new assignment and at
least one workbook exercise was needed to fill the gap - an assignment about the
pandemic seemed appropriate. Plus it provides students with experience analyzing
and visualizing a large data set from a compelling real world problem. Finally
past students have asked me to include an Excel VBA programming component
because they have indicated, for their area of study at least, that there are
job opportunities requiring knowledge of Excel VBA programming. So this
assignment seemed to fill all these needs.
-
A Excel macro enabled
spreadsheet: [A4_Covid_Data].
This file is a macro enabled Excel spreadsheet that you can use to contain
or save your program. [Figure 3] shows the macro
being saved to spreadsheet for workbook exercise. For this assignment you
should save your A4 macro into the
A4_Covid_Data spreadsheet. You are given
starting data as a convenience. Your program should not be written only to
handle the data in this file (in this file it's just over 600 Covid cases).
Instead your program must be written so that the
program features work for any data set so long as the format is
the same as this spreadsheet (e.g. the dates are in Column
B, the cases status is in
Column F etc.).
-
Word 2007 document: [A4_backup].
Copy-paste your program into this document in order to back up your work.
Complete the features specified below and submit the macro enabled spreadsheet
along with the starting Word document into D2L.
Testing your
program
-
Your program
should not be written specifically handle the starting data in the [starting
spreadsheet provided]
-
Instead you
should create your own Excel data sets in order to test that your program is
producing correct results under different scenarios e.g. few cases, many
cases, no cases where people have died, many/all cases have died etc.
-
To make it
easier to check your results it's suggested that most of your test cases
include a smaller number of lines containing Covid cases. (I show you an
example of how you can modify the test data during the lecture on March 31).
-
You should be
prepared for your marker trying out their own test files so you should
perform a reasonable amount of testing because if an error shows up while
they are marking then your mark can be affected.
-
To get you
started here is an [additional test
spreadsheet] that you can use.
Saving your
macro
It's critical that you
save (or contain) your macros in the above spreadsheet [Figure 3]
when you first create the macro (via
View->Macros
in the Excel ribbon).
DO NOT save it in "All Open workbooks". Doing the latter will save your
macro in another spreadsheet that is local to your computer. (This means that
when your marker downloads your submission the code will not be available and
you will receive a zero).
Figure 3: Saving the
macro in a container spreadsheet that allows macros to be enclosed (.xlsm)
Periodically you should back up your spreadsheet (e.g. at the very least upload
it into D2L) and you should copy-paste your program into the provided Word
document (and submit that into D2L as well). That way you will have a fall back
if disaster strikes so you have a copy to retrieve. While you aren't directly
required suggestion is included in the assignment
description for your benefit. It may give you a fall back if things don't go
well. If you ignore these suggestions and run into problems then you will not
be given any special considerations (e.g. extra time, resubmitting the work,
alternative forms of evaluation etc.)
Functional program requirements: write a VBA macro that includes the following features
(failing to follow [style requirements]
and [documentation requirements] will
also have an affect on your grade).
All the features
must be implemented with Excel VBA instructions. No credit will be awarded if
the feature is simulated by invoking the feature via the Excel ribbon or via an
Excel function such as COUNTIF.
The use of VBA functions (e.g. MsgBox)
and methods (e.g.
ActiveSheet.Shapes.AddChart2) is not only
'allowed' but it is required. Of course manually typing information such as the
infection count into the spreadsheet will earn no credit.
-
Write the text "Total cases" into the cell
I1.
(0.2
GPA)
-
Counts the total number of Covid cases
(includes all cases status such as 'Recovered', 'Active' etc.) and displays
this information in a MsgBox
when the program has counted all the infection numbers in the spreadsheet.
The count must be conducted with
a loop and variable is used to track the count. (1.1
GPA)
-
Writes the count from the previous feature
(i.e. the MsgBox
output) to Cell J1.
(0.2
GPA)
-
The two Cells where the information written
from the previous two features are bolded.
(0.1 GPA)
-
The data in the spreadsheet is sorted by date
(earliest to latest).
(Max
0.3 GPA
for sorting)
- Only a hard-coded (fixed)
range such as B2
to G21
is sorted. (0.1 GPA)
- All rows which contain Covid data
will be sorted. If there are two rows of data then only those two rows
will be sorted. If there are 55,000 rows of infection data then all
those rows be sorted. This is obviously the superior approach because
the number of cases may increase over time. (0.3
GPA
)
The date in which infection data appears
in the spreadsheet (i.e. the entry of a date in the rows of Column
B
signify that an infection occurred on that date) will be written into Column
I.
Regardless of the number of infections that occurred on a particular day,
the date information is only written into Column
I once.
[Image]
(0.4 GPA)
(Nesting is mandatory) The number of new
infections for a particular date are written into the rows of Column
J
with the first date appearing at Cell
J3. Because the date only appears once
in Column I
the number of infections for a particular day will appear only once on one
row. [Image]
(0.6 GPA)
(Nesting is mandatory) The number of
people who passed away on particular date will be written into the rows of
Column K.
Similar to new infections only the total number of people who passed away on
a particular date will be written and written once. [Image]
(0.6
GPA)
(To get credit for this feature the
previous 3 features need to correctly and completely implemented). Insert a
chart that graphs: the date, number of infections for each date, the number
who passed away on each date. Whether one chart is used for new infections
and deaths, or one chart graphs infections over time while the other graphs
deaths over time the same credit will be awarded.
Acceptable charts include: line, bar
or column. Unacceptable graphs for this type of data include: pie charts and
donut charts. Other graphs may be acceptable but avoid 3D representations
because they may not add any thing to the visualization and often make it
harder to interpret results. If you're concerned that your choice of graph
will affect your grade it's best to stick to one of the 3 specified types
and use a 2D form only. Similar
to the feature where you sort the data, the graph should only include in its
range rows that contain Covid data. A fixed range based on number of cases
that happen to be found in a particular spreadsheet should not be used. (0.3
GPA)
(Requires the previous feature to be
complete and correct). The chart title must be changed to include the text
"Alberta Covid statistics". If you have a separate chart for new infections
and deaths then each chart can include this text plus more specific
information e.g. "Alberta Covid statistics: New infections" and "Alberta
Covid statistics: Number of people who passed away".
(0.1 GPA)
You can assume that infection data is continuous. The end of the infection data
will always be followed by an empty row in the spreadsheet. That means that your program won't have to
handle a case where a blank line in the spreadsheet separates infection data
Features that must employ nesting #7 & #8.
Both of the following approaches are valid,
some students might find the second approach somewhat less complex.
- Approach #1: Nested loops. The first loop (outer) runs from
the first row containing Covid data at the third row and running until
there are no more rows containing Covid data. In the [second
spreadsheet provided for testing] this loop will run 19 times once
for each row of Covid data. The second loop (nested inner loop) runs
through the Covid data for a particular day. In the second modified
spreadsheet it will run: once for March 6, eight times for March 9, ten
times for March 10. The nesting is similar to the [lecture
example: nested loops] that tallies the number of students in each
tutorial.
- Approach #2: Nesting using a single loop that contains nested
IFs (instead of a nested loop). In the
modified starting spreadsheet there is still an outer loop than runs 19
times. But each time the loop runs the program checks to see if the
program is still looking at the current sequence of students in the same
tutorial via the IF-ELSE branch. [Lecture
example: IF nested within a loop]
Videos
-
Part I: Background information, not mandatory to complete your assignment
but it's for anyone is curious as to why I took the time to write a program to visualize Covid-19
data, how it became an assignment. [Link
to video]
-
Part II: Going over assignment requirements and showing the execution of my
solution. [Link
to video]
-
Contact information: your full name, student identification number and (New
For A4) your tutorial number. (0.1
GPA)
-
New For A4: Demonstrate some evidence of
a versioning system. The program that you submit must specify at least one
version number (a date is acceptable) (0.05
GPA). That version must clearly,
completely and correctly specify what features of the assignment that you
completed or didn't complete. If your program includes more than one version
then list the features completed for each version. [Example]
More information about versioning (and documentation)
is provided in [the VBA Part I
notes and lectures] (0.25
GPA)
- Each level of code indenting is
consistently 1 tab. Instructions
in the sub-routine (between the 'sub'
and 'end-sub' is 1
level, the body of IF or WHILE structures
counts as another level of indenting. An IF within
an IF or WHILE within
a WHILE (or even
an IF within a WHILE or
a WHILE within an IF) each
count as another level. Penalty: -0.2 GPA applied
to any case (and not each case) where the required indenting is not applied
and applied consistently.
- Good naming conventions (e.g. variables,
sub-routines, the name of Word document containing the VBA program and
constants if applicable) are followed. Some examples of naming conventions
are provided in [the VBA Part I
notes]. Penalty: -0.2 GPA applied
to any case (and not each case) where poor naming conventions have been
used.
- New for A4: The use of named constants (One of many
possible examples: Const EMPTY_CELL As
String = "") as
appropriate. Named constants were first introduced in the first VBA
programming set of notes. Examples have been shown in some of the subsequent
Word examples and many were shown in the two "VBA Extras" lectures. Penalty: -0.2
GPA applied to any case
(and not each case) where named constants could have been used but were not.
-
Operating system specific issues: submissions
must work on the machines a Windows computers. (For the remote learning
semester since access to the lab computers is more challenging: the
requirement is that your document works on any Windows computer). It's up to
you to test and check this is the case. Non-functional submissions will
receive only partial credit (if any at all). Although
operating system issues are rare (non-existent?) if you implement your
solution on another operating system it's a good idea to check your
submission on a Windows computer. (See the administrative video lecture
loaded into D2L for the week of
Jan. 9 - 15 for suggestions as to how you can do this). Caution for
Windows users: take care that you don't
accidentally submit a shortcut to a file instead of the actual file.
(Check the file name and compare the file size to your original file. Simply
downloading the shortcut file as a test won't work because that shortcut
will work on your computer but not on anyone else's machine).
-
Late assignments or
components of assignments: Extensions may
be granted for reasonable cases by the course instructor with the receipt of
the appropriate documentation (e.g., a
statutory declaration with a commissioner of oaths). Location of the
online form: [PDF
file]. Alternate submission mechanisms (non exhaustive list of examples:
email, uploads to cloud-based systems such as Google drive, time-stamps, TA
memories) cannot be used as alternatives if you have forgotten to submit
work or otherwise have not properly submitted into D2L. Only files
submitted into D2L by the due date is what will be marked,
everything else will be awarded no credit. The final cut off date after
which full assignments will not be accepted is after the maximum
progressive penalty (listed below) can be applied.
-
Assignment extensions (must
be requested before the assignment is due): may be granted for reasonable
cases by the course instructor with the receipt of the appropriate
documentation (e.g., a
sworn declaration signed by a commissioner of oaths). Typical examples
of reasonable cases for an extension include: illness or a death in the
family. Example cases where extensions will not be granted include
situations that are typical of student life: having multiple due dates, work
commitments, forgetting or mixing up due date etc. Tutorial instructors
(TAs) will not be able to provide extension on their own and must receive
permission from the course instructor first. If
you request an extension from me let me know the name of your tutorial
instructor and the tutorial number [link to : tutorial
and TA information) because the markers won't accept
late submissions without directly getting an email from me and I need to
know who to contact. Also extensions need to be requested before the
due date and not afterward.
-
How you will be graded for assignments.
Feedback will be provided by the Teaching assistant in a [marking
checklist]. Besides seeing your grade point in D2L you can also see the
detailed feedback in this spreadsheet that your TA will enter for each
student because it will uploaded for each assignment into the D2L Dropbox.
You can access the grading sheet in D2L under Assessments->Dropbox and
then clicking on the appropriate assignment link. If you still cannot find
the grading sheet then here is a [help
link]
-
Questions or
concerns about grades have been released: Assignments will be marked by
your tutorial instructor (the "Teaching Assistant" or "TA") for your tutorial
section (instructor information listed here as well). When you have
questions about marking this is the first person that you should be
directing your questions towards. If you still have question after you have
talked to your TA, then you can talk to your course
(lecture) instructor but please indicate in your email that you first
contacted your TA before going into your concerns.
Collaboration:
Assignments must reflect
individual work;
group work is not allowed in this class nor can you copy the work of others.
Some "do nots" for your solution: don't publically
post it, don't email it out, don't show it to other students,
don't get help with your assignment from a tutor (if you have hired one).
For more detailed information as to what constitutes academic misconduct (i.e.,
cheating) for this course please read the following [link].
Submitting your work:
-
The document must be electronically submitted using D2L.
-
Submit a macro-enabled document (docm)
containing your program which should be accessible via the Visual Basic
editor. As a precaution it's recommended that you copy-paste the VBA program
into the body of the Word document that you are submitting in case you
didn't properly save your program in the docm document.
-
Make sure that you [check
the contents of your submitted files]
(e.g., is the file okay or was it corrupted, is it the correct version
etc.). It's your responsibility to do this! (Make sure that you
submit your assignment with enough time before it comes due for you to do a
check). If don't check and there were problems with the submission then you
should not expect that you can "learn your lesson" and simply resubmit.
-
Do not use compression utilities (such as zip) or archiving utilities
(such as tar) otherwise your submission may not be marked. The space savings
in D2L is not worth the extra time required by the marker to process each
submission.
-
How often can you submit: Multiple
submissions are allowed for this assignment: You can (and really should)
submit work as many times as you wish before the due date. Due dates are
strict, only what is in D2L by the deadline is what will be marked. Other
methods of verifying that your work was completed on time (e.g. checking
timestamps, emailed files etc.) will NOT be accepted.
However only the latest versions of each individual document are the ones
that will be marked, everything else will be ignored (because it is not fair
to your marker to sort through multiple versions of your files). In the case
of A1 the marker will just look at the latest version of: Calgary and the
merged letter that the marker can 'click
through'.
Late
submissions for assignments when there is no extension granted: Make
sure you give yourself enough time to complete the submission process so
you don't get cut off by D2L's deadline (or your submission will be
automatically flagged as late by D2L and it will be graded appropriately)..
Submission received: |
On time |
Hours late : >0 and <=24 |
Hours
late: >24
and <=48 |
Hours
late: >48
and <=72 |
Hours
late: >72
and <=96 |
Hours
late: >96 |
Penalty: |
None |
-1 GPA |
-2 GPA |
-3 GPA |
-4 GPA |
No credit
(not accepted) |
I would to thank the AHS staff who provided me with the permission to use the
Covid-19 infection data for Alberta.