The PBO was asked for clarification regarding the quality of the dataset used in our analysis of the Investing in Canada Plan (IICP).
In February 2020, we asked all federal departments and agencies responsible for spending IICP money for a list of the projects they funded between 2016-17 and 2027-28.
By early May, the Government was able to provide us with a list of 33,122 project-level “entries”. The Government assured us that each of these “entries” corresponded to a unique project. However, critical information such as project name and location is missing from a small number of the “entries” provided.
Additionally, the list contained “entries” for which the project name, city and province were identical, which we pointed out to the departments concerned. Departments indicated that these “entries” were independent, unique projects. However, we could not validate this. Within our dataset of 33,122 entries, we estimate that roughly 14,000 entries have non-unique information. The PBO cleaned the data to ensure consistency across entries. This included detecting entries labelled “not applicable” or “unallocated”, or for which information was provided only at the national level without a specific city, and correcting them to ensure accurate and consistent records across departments.
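As an illustration only, the check for non-unique entries described above could be sketched as follows. This is not the PBO's actual procedure: the field names (`name`, `city`, `province`), the sample records and the exact placeholder list are assumptions made for the example.

```python
from collections import Counter

# Placeholder labels treated as missing values (assumed list, based on the
# labels mentioned in the text; the empty string stands for a blank field).
PLACEHOLDERS = {"", "not applicable", "unallocated"}

def normalize(value):
    """Trim and lower-case a field; map placeholder labels to None."""
    text = (value or "").strip().lower()
    return None if text in PLACEHOLDERS else text

def flag_non_unique(entries):
    """Return the indices of entries whose normalized (name, city, province)
    key is shared with at least one other entry."""
    keys = [
        tuple(normalize(e.get(field)) for field in ("name", "city", "province"))
        for e in entries
    ]
    counts = Counter(keys)
    return [i for i, key in enumerate(keys) if counts[key] > 1]

# Hypothetical sample records, not taken from the IICP dataset.
entries = [
    {"name": "Bridge repair", "city": "Ottawa", "province": "ON"},
    {"name": "bridge repair ", "city": "Ottawa", "province": "ON"},
    {"name": "Transit hub", "city": "Not applicable", "province": "BC"},
]
non_unique = flag_non_unique(entries)
```

A check of this kind only identifies entries that share the same identifying information; it cannot establish whether they are in fact the same project, which is why the departments' assurances could not be validated.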
After discussion with Infrastructure Canada, the PBO nonetheless decided to assume that each entry with funding paid out represents a unique project, despite incomplete or missing information or, in other cases, the same project being reported across multiple entries. Entries that had incomplete information and no money paid out were assumed to be planned projects for which funding would flow after 2019-20.
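The two assumptions above amount to a simple classification rule, which might be expressed as in the sketch below. The field names (`paid_out`, `name`, `city`, `province`) and the category labels are hypothetical, and the treatment of complete entries with no money paid out is not specified in this note, so the sketch leaves them unclassified.

```python
REQUIRED_FIELDS = ("name", "city", "province")  # assumed identifying fields

def classify(entry):
    """Apply the two stated assumptions to a single entry."""
    paid_out = entry.get("paid_out") or 0
    if paid_out > 0:
        # Funding paid out: treated as a unique project, even if
        # identifying information is incomplete or repeated elsewhere.
        return "unique project"
    incomplete = any(not entry.get(field) for field in REQUIRED_FIELDS)
    if incomplete:
        # No money paid out and incomplete information: treated as a
        # planned project with funding expected after 2019-20.
        return "planned project"
    # Case not covered by the assumptions stated in the text.
    return "unclassified"
```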
The amount of work our staff had to carry out on the data provided to the PBO by departments involved in the IICP is unusual. This indicates a need for improvements in data collection on projects funded by the IICP.