Convert FCC ULS Database To SQL


The FCC provides databases of license information called the Universal Licensing System (ULS). The data is useful for number of reasons and comes in a pipe-delimited format.


Table of Contents

  1. Why
  2. Building a Tool


Why

The FCC provides online search capability, however it can be slow at times. I also wanted a way to manipulate the data in ways not available using the FCC online search tool. My goal was to get this into an SQLite database as both a learning experience and to prepare for experimenting with large datasets.


Building a Tool

The ULS data files obtained from the FCC contains pipe-delimited data, which can be imported into any common spreadsheet application such as Microsoft Excel. In order to get this converted to valid SQLite schema, I needed to understand the fields.

The FCC provides a PDF file containing the definitions. Since the PDF file is over 80 pages of information, I wanted to come up with a programmable way of extracting the definitions from the PDF file. I tried a few different methods using various PDF modules available in Python, but nothing was reliable. The tables in the PDF file in particular caused most of the headaches in trying to come up with a reliable parser.

Sheepishly, it took me a bit to realize that the FCC also provides a plain text SQL definitions file, which could be reliably parsed.

create table dbo.PUBACC_A2
(
      Record_Type               char(2)              null,
      unique_system_identifier  numeric(9,0)         not null,
      ULS_File_Number           char(14)             null,
      EBF_Number                varchar(30)          null,
      spectrum_manager_leasing  char(1)              null,
      defacto_transfer_leasing  char(1)              null,
      new_spectrum_leasing      char(1)              null,
      spectrum_subleasing       char(1)              null,
      xfer_control_lessee       char(1)              null,
      revision_spectrum_lease   char(1)              null,
      assignment_spectrum_lease char(1)              null,
      pfr_status    char(1)        null
        
)
I used this information to extract the record type, which is the two characters after the underscore in the table name. For example, in the above table dbo.PUBACC_A2, A2 is the record type. The fields are then listed in the table. I used this information to create a parser to build a table of valid record types, accounting for any new record types and associated fields that might be added or removed by the FCC in the future.

Once valid record types could be identified, I was able to run through a ULS database and carve out valid record types and place them into CSV files, one per record type. Part of this process included adding a header row to the CSV file, containing all the columns which will be used to create the SQLite table schema.

For the SQL schema, my goal was to use a single database, and each table within would correspond to a valid record type from the the corresponding CSV file. Inside the table would be the valid columns generated from the CSV file header row for each record type.

With the generated CSV files for each record, I used the sqlite-utils Python module to import the data into a SQLite database.

Once the data is populated into an SQLite database, you can perform any number of SQL queries against it to access the data in meaningful ways. Find expired licenses, locate frequencies assigned to businesses, and more.

It was a fun project, with plenty of room for improvement. Some of the larger databases take a while to process and would probably benefit from some multiprocessing to speed things up. I made an initial attempt at making this happen, but I'm still wrapping my head around the concept.

You can download the tool here.