
The NPPES Data
CMS.GOV Responsibility
The job of collecting and disseminating NPI data fell to the Centers for Medicare & Medicaid Services (CMS.GOV) and came about as a result of the HIPAA legislation we discuss here. CMS named the system the National Plan and Provider Enumeration System (NPPES).
​
If you go to the NPPES site you will see that CMS collects and stores information from healthcare providers. Every month, around the 10th, CMS.GOV exports this database to a CSV file and publishes it on its website, here. Newly issued NPIs are included, discontinued NPIs are removed, and any changes to a healthcare provider's information are reflected.
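
If each monthly export adds and drops NPIs as described, one way to see what changed is to compare the NPI column of two consecutive exports. The following is only a minimal sketch: it assumes the identifier column is named "NPI", the function names are hypothetical, and the sample values are made up for illustration.

```python
import csv
import io


def load_npis(handle):
    """Return the set of NPI values found in one export."""
    return {row["NPI"] for row in csv.DictReader(handle)}


def compare_exports(previous, current):
    """Return (added, removed) NPIs between two monthly exports."""
    old, new = load_npis(previous), load_npis(current)
    return new - old, old - new


# Tiny in-memory stand-ins for two monthly files (made-up NPIs).
march = io.StringIO("NPI\n1000000001\n1000000002\n")
april = io.StringIO("NPI\n1000000002\n1000000003\n")
added, removed = compare_exports(march, april)
```

With real exports you would pass open file handles instead of the in-memory samples. Holding two sets of several million identifiers takes a noticeable amount of memory, so this approach suits a workstation rather than a small server.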
​
The CMS.GOV NPPES data is the de facto standard for information about National Provider Identifiers (NPIs) and the healthcare providers associated with them. The CSV file that CMS publishes is the only source of NPI data; all other sources are derived from it.
​
It's up to you to figure out how to use the CSV file, and that is not trivial.
​
Quantifying The Information
The CSV file you download from the NPPES website is about 10 gigabytes in size. It holds a vast quantity of information: about 9 million healthcare providers, with each NPI record containing just under 400 data points. That is roughly 3.6 billion data points, each of which has to be examined and assigned to the right place.
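
A file of this size is usually read as a stream rather than loaded into memory at once. The sketch below shows one way to count rows and columns in a single pass and reproduces the back-of-envelope arithmetic above. The function name and the sample data are hypothetical. The provider and column counts are the approximate figures quoted in the text, not measured values.

```python
import csv
import io
from typing import TextIO, Tuple


def count_data_points(handle: TextIO) -> Tuple[int, int, int]:
    """Stream a CSV and return (rows, columns, rows * columns)."""
    reader = csv.reader(handle)
    header = next(reader)          # first line holds the column names
    columns = len(header)
    rows = sum(1 for _ in reader)  # one pass, one row in memory at a time
    return rows, columns, rows * columns


# Small in-memory sample standing in for the real export.
sample = io.StringIO("NPI,Name\n1000000001,A\n1000000002,B\n")
rows, columns, total = count_data_points(sample)

# Back-of-envelope check of the approximate figures quoted in the text.
APPROX_PROVIDERS = 9_000_000
APPROX_COLUMNS = 400
estimate = APPROX_PROVIDERS * APPROX_COLUMNS  # 3,600,000,000
```

For the real file you would pass a handle opened with newline="" so the csv module can handle quoted line breaks correctly.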
When we reverse-engineered the CSV file we wound up with 8 normalized tables, all of which hold multiple records for each NPI. We use Object Role Modeling to design our databases.
​
You wind up with about 18 million addresses, about 36 million phone numbers and around 12 million taxonomy codes assigned to healthcare providers. The MediSeek Recaster database, full of data and indexed for quick access, is about 80 gigabytes in size.
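
To illustrate why one wide CSV row turns into many rows in a normalized table, here is a minimal sketch that splits repeated taxonomy columns into one row per code. It is not the Recaster implementation. It assumes the repeated columns share the prefix "Healthcare Provider Taxonomy Code_" followed by a slot number and that the identifier column is named "NPI", so check both against the header of the file you download. The NPIs in the sample are made up.

```python
import csv
import io

# Assumed naming of the repeated column group; verify against the real header.
TAXONOMY_PREFIX = "Healthcare Provider Taxonomy Code_"


def split_taxonomies(handle):
    """Yield (npi, slot, taxonomy_code), one tuple per non-empty slot."""
    for record in csv.DictReader(handle):
        for column, value in record.items():
            # Skip unrelated columns and empty slots.
            if column and column.startswith(TAXONOMY_PREFIX) and value:
                slot = int(column[len(TAXONOMY_PREFIX):])
                yield record["NPI"], slot, value


# Two made-up providers: one with a single code, one with two.
sample = io.StringIO(
    "NPI,Healthcare Provider Taxonomy Code_1,Healthcare Provider Taxonomy Code_2\n"
    "1234567890,207Q00000X,\n"
    "1098765432,208D00000X,207R00000X\n"
)
taxonomy_rows = list(split_taxonomies(sample))
```

The same pattern would apply to other repeated groups such as addresses or phone numbers, with each group feeding its own child table keyed by NPI.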
​
More recently, CMS has added two additional data files: one containing other names used by an NPI, and one containing other practice locations where a healthcare provider may practice. Recaster processes these two additional CSV files into the same relational database.
​
Who Uses The NPPES Data?
​​
Larger healthcare companies do transform the NPPES data into a relational database for their own internal use. Other companies do so too, makers of medical billing software being one example. These large corporations know the value of this data. Have you seen any large corporation say, "Hey, do you want to use our NPPES-derived relational database?" They do not offer this. No large corporation has made its database derived from the CMS.GOV NPPES exports publicly available at any price.
​
Why would they not? Because they spent a fortune developing programs to use the data, and they would have to charge too much money. Furthermore, they don't want to risk a competitor using their software for pennies on the dollar.
Who Else Publishes A Usable Version Of The NPPES Data?
The keyword is 'usable'. Most of the attempts you see online at unlocking the NPPES CSV file are half-hearted. Very few organizations compete with MediSeek by offering the NPPES data to the general public in a relational database format.
