To the main website of James Tam Return to the CPSC 203 web page

A4: VBA Programming For Excel

Due dates for all assignments can be found on the [graded components] section of the course webpage.

This assignment allows you to work on a real world problem (analyzing and visualizing actual Covid infection data in Alberta.

Back ground information (by James Tam).

JT: assignment requirements aren't directly included in this section and the reader looking only for that information can skip directly to the text under the next heading "Starting files that you should modify and submit". The background information is provided for those who are interested in learning what motivated me to write a program to track infection rates which eventually became the genesis for an assignment in CPSC 203.

Sometime after infections began to occur in Canada there was constant message about the need to "flatten the curve" (a decrease in the slope of a function that plots infection rates) and things were listed that could be done to attain that goal. To keep the public informed about the progress of the infections, agencies such as Alberta Health Services (AHS) would release information that included: infection rates, age demographics, gender as well information about the people who had unfortunately succumb to the illness. Figure 1 graphs the information about the progress of infection rates and fatalities based on the AHS information. This graph was created from actual AHS data and the instances of new infections and fatalities were counted and graphed automatically using an Excel VBA program. 

Figure 1: Graphing new infections (line higher up in blue) vs. the number of infected people who have passed away (the lower line in a short of orange-red color).

In the early days of the infections in the spring  of 2020 when the number of new infections was lower, the number of new infections vs. those who died could be done in one graph. The difference between the numbers wasn't so large that trends could still be seen over time. With the increase in infection numbers in the fall it may be easier to determine trends by two separate graphs (Figure 2).

Figure 2A: Number of new infections
 
Figure 2B: Number of people who passed away

Of course it is easier to see change in the infection rate if only new infections are charted. While it may be possible to see the decreasing curve when charting total infections at a particular point in time it's harder to see trends until they become extreme. While I was happy to view other people's pre-created visualizations I found that information available last spring lacking. Like everyone else I wanted to see a sign of hope that things would get better (trying to keep "hope in the heart" as it were) and because only total infection rates were provided by AHS (it was for instance provided by the Quebec equivalent of AHS) I ended up producing this information myself.

However, I found that AHS was quite open and transparent in the actual data regarding infection rates and it can be found on the publically accessible website: https://www.alberta.ca/stats/covid-19-alberta-statistics.htm. Click under "Data Export" and the page will display actual information about infections in Alberta in tabular format. Each row is actual information about a person that was infected with Covid in the province. (Since the information was available in a public website I assumed that it met privacy requirements. I did however get verbal permission of a supervisor from AHS before designing this assignment. As of November 27, 2020 there were 56,444 people who were infected. Scroll to the bottom of the page click on the "Excel" button to download an actual spreadsheet with this information. Keep in mind that the data is set is large so at this point the creation of the spreadsheet may take a fair amount of time. Don't close or interrupt the browser until the process is complete.

With the raw data in hand I at first I had no desire to write a program to count the infection numbers. I'm of the attitude why write a program if that capability is already provided in the Excel functions. While graphing functions abound in Excel, the ability to tally infection rates would be awkward (new instances of the COUNTIF and COUNTIFS functions would be needed for each day and the range would have to be constantly updated as AHS added more rows of data to the information they released. Thus had to I write a program that would help me visualization the information.

Now of course many media outlets (if you're interested I've found the CBC to be one of the best in this regard) have hired visualization specialists to not only graph the progress of the disease over time but some have even created animated changes over time or interactive displays. So there is very little need for an individual person to plot the information on their own. However, I still decided to use the gist of the problem as an assignment: the plotting of Alberta Covid data. It seemed appropriate - since the exam components were dropped because it was a distance learning semester required by the Covid pandemic and a new assignment and at least one workbook exercise was needed to fill the gap - an assignment about the pandemic seemed appropriate. Plus it provides students with experience analyzing and visualizing a large data set from a compelling real world problem. Finally past students have asked me to include an Excel VBA programming component because they have indicated, for their area of study at least, that there are job opportunities requiring knowledge of Excel VBA programming. So this assignment seemed to fill all these needs.

Starting files that you should modify and submit:

Complete the features specified below and submit the macro enabled spreadsheet along with the starting Word document into D2L.

Testing your program

Saving your macro

It's critical that you save (or contain) your macros in the above spreadsheet [Figure 3] when you first create the macro (via View->Macros in the Excel ribbon). DO NOT save it in "All Open workbooks". Doing the latter will save your macro in another spreadsheet that is local to your computer. (This means that when your marker downloads your submission the code will not be available and you will receive a zero).

Figure 3: Saving the macro in a container spreadsheet that allows macros to be enclosed (.xlsm)

Periodically you should back up your spreadsheet (e.g. at the very least upload it into D2L) and you should copy-paste your program into the provided Word document (and submit that into D2L as well). That way you will have a fall back if disaster strikes so you have a copy to retrieve. While you aren't directly required suggestion is included in the assignment description for your benefit. It may give you a fall back if things don't go well. If you ignore these suggestions and run into problems then you will not be given any special considerations (e.g. extra time, resubmitting the work, alternative forms of evaluation etc.)

Functional program requirements: write a VBA macro that includes the following features (failing to follow [style requirements] and [documentation requirements] will also have an affect on your grade).

All the features must be implemented with Excel VBA instructions. No credit will be awarded if the feature is simulated by invoking the feature via the Excel ribbon or via an Excel function such as COUNTIF. The use of VBA functions (e.g. MsgBox) and methods (e.g. ActiveSheet.Shapes.AddChart2) is not only 'allowed' but it is required. Of course manually typing information such as the infection count into the spreadsheet will earn no credit.

  1. Write the text "Total cases" into the cell I1. (0.2 GPA)
  2. Counts the total number of Covid cases (includes all cases status such as 'Recovered', 'Active' etc.) and displays this information in a MsgBox when the program has counted all the infection numbers in the spreadsheet. The count must be conducted with a loop and variable is used to track the count. (1.1 GPA)
  3. Writes the count from the previous feature (i.e. the MsgBox output) to Cell J1(0.2 GPA)
  4. The two Cells where the information written from the previous two features are bolded. (0.1 GPA)
  5. The data in the spreadsheet is sorted by date (earliest to latest). (Max 0.3 GPA for sorting)
    1. Only a hard-coded (fixed) range such as B2 to G21 is sorted. (0.1 GPA)
    2. All rows which contain Covid data will be sorted. If there are two rows of data then only those two rows will be sorted. If there are 55,000 rows of infection data then all those rows be sorted. This is obviously the superior approach because the number of cases may increase over time. (0.3 GPA)
  6. The date in which infection data appears in the spreadsheet (i.e. the entry of a date in the rows of Column B signify that an infection occurred on that date) will be written into Column I. Regardless of the number of infections that occurred on a particular day, the date information is only written into Column I once. [Image] (0.4 GPA)
  7. (Nesting is mandatory) The number of new infections for a particular date are written into the rows of Column J with the first date appearing at Cell J3. Because the date only appears once in Column I the number of infections for a particular day will appear only once on one row. [Image] (0.6 GPA)
  8. (Nesting is mandatory) The number of people who passed away on particular date will be written into the rows of Column K. Similar to new infections only the total number of people who passed away on a particular date will be written and written once. [Image] (0.6 GPA)
  9. (To get credit for this feature the previous 3 features need to correctly and completely implemented). Insert a chart that graphs: the date, number of infections for each date, the number who passed away on each date. Whether one chart is used for new infections and deaths, or one chart graphs infections over time while the other graphs deaths over time the same credit will be awarded. Acceptable charts include: line, bar or column. Unacceptable graphs for this type of data include: pie charts and donut charts. Other graphs may be acceptable but avoid 3D representations because they may not add any thing to the visualization and often make it harder to interpret results. If you're concerned that your choice of graph will affect your grade it's best to stick to one of the 3 specified types and use a 2D form only. Similar to the feature where you sort the data, the graph should only include in its range rows that contain Covid data. A fixed range based on number of cases that happen to be found in a particular spreadsheet should not be used. (0.3 GPA)
  10. (Requires the previous feature to be complete and correct). The chart title must be changed to include the text "Alberta Covid statistics". If you have a separate chart for new infections and deaths then each chart can include this text plus more specific information e.g. "Alberta Covid statistics: New infections" and "Alberta Covid statistics: Number of people who passed away". (0.1 GPA)

You can assume that infection data is continuous. The end of the infection data will always be followed by an empty row in the spreadsheet. That means that your program won't have to handle a case where a blank line in the spreadsheet separates infection data

Features that must employ nesting #7 & #8. Both of the following approaches are valid, some students might find the second approach somewhat less complex.

Videos

Documentation requirements

  1. Contact information: your full name, student identification number and (New For A4) your tutorial number. (0.1 GPA)

  2. New For A4: Demonstrate some evidence of a versioning system. The program that you submit must specify at least one version number (a date is acceptable) (0.05 GPA). That version must clearly, completely and correctly specify what features of the assignment that you completed or didn't complete. If your program includes more than one version then list the features completed for each version. [Example] More information about versioning (and documentation) is provided in [the VBA Part I notes and lectures] (0.25 GPA)

Style requirements

  1. Each level of code indenting is consistently 1 tab. Instructions in the sub-routine (between the 'sub' and 'end-sub' is 1 level, the body of IF or WHILE structures counts as another level of indenting. An IF within an IF or WHILE within a WHILE (or even  an IF within a WHILE or a WHILE within an IF) each count as another level. Penalty: -0.2 GPA applied to any case (and not each case) where the required indenting is not applied and applied consistently.
  2. Good naming conventions (e.g. variables, sub-routines, the name of Word document containing the VBA program and constants if applicable) are followed. Some examples of naming conventions are provided in [the VBA Part I notes]. Penalty: -0.2 GPA applied to any case (and not each case) where poor naming conventions have been used.
  3. New for A4: The use of named constants (One of many possible examples: Const EMPTY_CELL As String = "") as appropriate. Named constants were first introduced in the first VBA programming set of notes. Examples have been shown in some of the subsequent Word examples and many were shown in the two "VBA Extras" lectures. Penalty: -0.2 GPA applied to any case (and not each case) where named constants could have been used but were not.

Marking and grading

Collaboration:

Assignments must reflect individual work; group work is not allowed in this class nor can you copy the work of others. Some "do nots" for your solution: don't publically post it, don't email it out, don't show it to other students, don't get help with your assignment from a tutor (if you have hired one).  For more detailed information as to what constitutes academic misconduct (i.e., cheating) for this course please read the following [link].

Submitting your work:

D2L configuration for this course

Late submissions for assignments  when there is no extension granted: Make sure you give yourself enough time to complete the submission process so you don't get cut off by D2L's deadline (or your submission will be automatically flagged as late by D2L and it will be graded appropriately)..

Submission received:

On time

Hours late : >0 and <=24

Hours late: >24 and <=48

Hours late: >48 and <=72

Hours late: >72 and <=96

Hours late: >96

Penalty:

None

-1 GPA

-2 GPA

-3 GPA

-4 GPA

No credit (not accepted)

Acknowledgment

I would to thank the AHS staff who provided me with the permission to use the Covid-19 infection data for Alberta.