Transforming pharmaceutical and healthcare industries through revolutionary data intelligence
Upcoming Webinar

Science and Open PHACTS:

Maximising impact for pharmaceutical applications

Date and Time: Monday, June 27, 10-11 AM EDT

To sign up for the webinar.

Open PHACTS provides a powerful resource for exploring biomedical data. However, the power of this dataset can be fully leveraged only when it is combined with companies' internal data sources. Data2Discovery, Inc. is creating systems that map proprietary and other data sources into semantic linked form that can be combined with Open PHACTS data in a smart data lake. Combining these data sources will allow researchers across pharmaceutical companies to quickly and easily use integrated public and proprietary datasets to make new discoveries. David Wild (CEO, Data2Discovery) will demonstrate Data2Discovery's developments, including P3, an application the company has developed for phenotypic screening analysis.



We are committed to creating high-impact, novel data solutions for pharmaceutical and healthcare customers. These industries are experiencing significant business model and regulatory changes that require new solutions and expertise. Our team, a unique group of data scientists and industry experts from Indiana University and Silicon Valley, is solving a variety of extremely challenging business and data problems for our customers.


Our vision is that all healthcare is data driven, and that value comes from finding links and building an accurate, deep, intuitive understanding that crosses the different healthcare domains: Pharmaceutical, Providers, and Payors. Our ideas, combined with customer knowledge and practice, create high value in the form of cost savings and improved efficiencies.


Phenotypic Drug Discovery Data Applications

Phenotypic Drug Discovery (PDD) is rapidly changing how new drugs are discovered and developed. Maximizing the impact of PDD data has generated a great deal of interest in pharmaceutical research in the last few years. However, there is currently a severe lack of computational and data tools that can connect the vast amounts of traditional molecular-based data with the equally vast amounts of PDD data now being generated. We are creating applications that bring together the wealth of public and proprietary data to develop solutions for every stage of the drug discovery process.

Data Integration and Analysis

We believe pharmaceutical companies should be more like cutting-edge data companies, using their wealth of information to make more informed decisions at all levels. We create solutions for complex data integration challenges, helping customers combine their disparate data silos so that they can use all of their data when making decisions.


Our team of experienced healthcare professionals is dedicated to transforming healthcare and creating the next generation of innovations by enabling interoperable technologies that connect datasets linking patients, providers, and health plans while meeting regulatory demands. We apply our patent-pending graph pattern analytic technologies to provide strategies, improve patient care, and reduce costs.

We differ from other business analytics and software systems in our use of data science, machine learning, and semantic web methodologies, including identifying path patterns and linkages in the data to aggregate various data sets and create meaningful insight. Data2Discovery uses this different technique to analyze the data and extract the critical information or signals that need to be managed. Simply put, we use data science to identify, predict, and prevent avoidable outcomes, ultimately transforming healthcare.

Our solution is to provide predictive analytics tools and the expertise necessary to solve your critical issues within various interlocking healthcare markets. We stay on the pulse of industry challenges and are prepared to tackle your specific needs in Revenue Cycle Management, Readmission Prevention, Risk Adjustment Management, and Population Health.

Revenue Cycle Management

Healthcare systems lose revenue because of unmanageable workloads and pressure to close accounts in a timely manner. By identifying trends and patterns in claims data, lost revenue can be found, and improperly billed claims can be identified early in the process to eliminate denials and decrease bad debt. The end result: revenue is collected and a healthcare system’s bottom line is improved.

Readmission Prevention

By prospectively identifying individuals at risk for readmission, efforts can be focused on applying resources to those who are at a higher risk for readmission. Using advanced semantic web technology and mapping data sets, we are able to include non-clinical sources and aggregate the data to provide meaningful solutions. We add to existing methodologies by including various data sources and significantly increase your predictive power.

Risk Adjustment Management

We understand that Health Plans are incentivized by Medicare based on risk scores calculated using CMS-HCCs (Hierarchical Condition Categories). Using comparative proprietary algorithms and data science and machine learning techniques, combined with our industry-specific healthcare knowledge, we help healthcare organizations ensure a fully optimized risk score.

Population Health & Predictive Analytics

We are committed to improving the health of the entire population by using data science, semantic web, and machine learning technology to aggregate disparate data sources, extrapolate information, and make meaningful predictions.

A unique approach to data analytics

Current Data Analytics

Current data analytics approaches make “black box” predictions and require highly curated, homogeneous data.

SEMAP™ Data Analytics

SEMAP™ can map together a wide range of data, of different types and from different sources, and therefore is able to identify indirect but statistically significant associations in data not found through other methods.


Using this mapped data, SEMAP™ can make predictions of connections between data points that are clear, interpretable, and visual, leading to insights that are understandable, actionable, and based on all the available data.


Using a unique process of semantically mapping the data sources, connections can be identified that cross existing domain silos to solve a variety of complex business problems.


SEMAP™ implements lightweight semantic connections between datasets to leapfrog data integration headaches, and it allows domain-specific knowledge to be encoded in the mapping to optimize predictions.
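To make the idea of a lightweight semantic mapping more concrete, here is a minimal Python sketch. It is only an illustration of the general technique, not SEMAP™ itself, whose implementation is proprietary and not described here. The records, column names, predicate names (`hasPhenotype`, `bindsTo`), and helper functions are all hypothetical. Each source is described by a small mapping that says which column is the subject, which is the object, and which relationship links them, so that rows from separate silos become subject-predicate-object triples in one shared graph.

```python
# Hypothetical records from two separate silos (illustrative data only).
assay_rows = [
    {"compound_id": "C1", "phenotype": "reduced_cell_growth"},
    {"compound_id": "C2", "phenotype": "neurite_outgrowth"},
]
target_rows = [
    {"cmpd": "C1", "protein": "P_A"},
    {"cmpd": "C2", "protein": "P_B"},
]

# One small mapping per source: which column supplies the subject, which
# supplies the object, and which (hypothetical) predicate connects them.
MAPPINGS = {
    "assay": {"subject": "compound_id", "predicate": "hasPhenotype", "object": "phenotype"},
    "target": {"subject": "cmpd", "predicate": "bindsTo", "object": "protein"},
}


def to_triples(rows, mapping):
    """Convert tabular rows into (subject, predicate, object) triples."""
    return [
        (row[mapping["subject"]], mapping["predicate"], row[mapping["object"]])
        for row in rows
    ]


def build_graph(sources):
    """Map every named source with its mapping and pool the triples."""
    triples = []
    for name, rows in sources.items():
        triples.extend(to_triples(rows, MAPPINGS[name]))
    return triples


def describe(triples, subject):
    """Return everything known about one subject, across all sources."""
    return sorted((p, o) for s, p, o in triples if s == subject)


triples = build_graph({"assay": assay_rows, "target": target_rows})
print(describe(triples, "C1"))
```

Because the two sources name their compound columns differently (`compound_id` and `cmpd`), the mapping is what lets facts about the same compound line up. Domain knowledge would be encoded in the same place, for example by choosing more specific predicates.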


SEMAP™ is highly scalable to increasing data size and complexity. It is currently being applied in domains such as healthcare, accounting, and pharmaceutical research where the data sets are large and of different types.


At the core of SEMAP™ is a set of proprietary graph-based algorithms that can rapidly predict, profile, and rank statistically significant connections in large, complex, heterogeneous networks of data brought from different silos.
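The SEMAP™ algorithms are proprietary and are not described here, so the following Python sketch shows only the general flavor of ranking connections in a heterogeneous network. Everything in it is an assumption made for illustration: the toy edges, the node naming scheme, and the scoring rule, which weights each path by the inverse degree of the nodes it passes through (a simple random-walk style heuristic, not a statistical significance test).

```python
from collections import defaultdict

# Toy heterogeneous network (hypothetical data): compounds, proteins, diseases.
EDGES = [
    ("compound:C1", "protein:P_A"),
    ("compound:C1", "protein:P_B"),
    ("protein:P_A", "disease:D1"),
    ("protein:P_B", "disease:D1"),
    ("protein:P_B", "disease:D2"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def simple_paths(adj, source, target, max_len):
    """Yield every simple path from source to target with at most max_len edges."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        if len(path) - 1 >= max_len:
            continue
        for nxt in sorted(adj[node]):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))


def path_score(adj, path):
    """Score a path as the product of 1/degree over every node except the last."""
    score = 1.0
    for node in path[:-1]:
        score /= len(adj[node])
    return score


def rank_targets(adj, source, prefix, max_len=2):
    """Rank nodes whose name starts with prefix by summed path score from source."""
    scores = {}
    for node in list(adj):
        if node.startswith(prefix):
            scores[node] = sum(
                path_score(adj, p) for p in simple_paths(adj, source, node, max_len)
            )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


adj = build_adjacency(EDGES)
ranked = rank_targets(adj, "compound:C1", "disease:")
for node, score in ranked:
    print(f"{node}: {score:.3f}")
```

In this toy network, `disease:D1` ranks above `disease:D2` because it is reachable from the compound through two proteins rather than one. The paths themselves can be shown to the user, which is what makes this style of prediction interpretable.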

About Our Team



Our team is led by highly experienced professionals within specific domains. The value and expertise we bring come from a variety of relevant fields of study. Our seasoned advisors have strong financial and industry market knowledge.