Dataset

Ad Hoc Services

Provision of Aggregated and Processed Patient Data for Analysis
Dataset

We provide analysis-ready datasets tailored to your needs.

EBM Medical Database

Data such as DPC data, laboratory test data, and claims data sourced from medical institutions and insurers are anonymized and compiled into a database. The database is used by pharmaceutical companies, medical device companies, academia, and public research institutions for marketing research, clinical research, and similar purposes.


Feature Overview

This service provides analysis-ready datasets for academic papers and other research projects.
Data is extracted in our standardized data format and tailored to your study design to ensure precise matching of your target patient population.

Main Survey Content in the Dataset

Data Table Overview

Data will be extracted and delivered in our standardized data format, organized under three main table headings:
Disease Name, Medical Procedures, and Laboratory Test Values.

Data Table Overview sample

Before Using the Data

Purpose Clarification

After discussing your intended research use for the dataset, we'll recommend a suitable patient cohort for extraction based on your study objectives.


Definition Review

You'll have the opportunity to review the proposed dataset specifications, including patient cohort definitions, tables, and data columns.


Contract

Following quotation approval, a signed contract will be necessary. We’ll submit an initial draft for your consideration.


Delivery

Given the expected size of the dataset, we plan to deliver the files via the cloud.

FAQs

Before Purchase

Yes, we are delighted to provide these details as part of our complimentary support service.

About service

Yes, we have contributed to over 700 published papers. Please refer to our Publication for more details.
After we receive your order, it usually takes up to 2 weeks.
We do not provide statistical analysis or writing services in-house, so we outsource this work to one of our trusted partners on your behalf.
Yes, provided that all usage aligns with the originally agreed-upon scope. We encourage diverse applications within those boundaries.
Yes, it is possible. Before disclosure, we require a signed third-party data provision agreement and a review of all materials to be disclosed.
Yes, all data can be provided in English as our database’s master version is maintained in English.

About the Hospital data

We currently cover approximately 30% of the 1,786 DPC-designated hospitals nationwide (as of December 31, 2024).
We collect insurance claim data for both inpatient and outpatient visits.
The hospitals are mainly advanced treatment hospitals, so they tend to have more patients with severe conditions than clinics do. However, they also see a significant number of patients in primary care areas, such as lifestyle-related diseases, and we receive many requests for analysis in those areas.
Our data covers DPC hospitals nationwide, and the age and gender distribution of patients is similar to that in the Ministry of Health, Labour and Welfare's published survey of patient distribution in medical institutions across Japan. The data is therefore considered representative.
As the largest database in Japan, with over 50 million patients, it has been used in research across many fields, including intractable and rare diseases.
The database is updated at the end of every month, and the latest data available after each update is from two months earlier. (For example, data for January 2021 becomes available after the update performed at the end of March 2021.)
Given the large scale of the database, we are confident that we can secure a robust number of cases for research in any disease area.
The web tool does not currently include data fields for assessing disease severity. However, through a separate service, it is possible to obtain data fields that allow for the assessment of severity in certain disease areas.
MDV analyzer includes a function that extrapolates results to the national level.
Unfortunately, our database only includes medical information that is eligible for insurance reimbursement, as it is based on claim data.
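
The monthly update schedule described above (updates at month end, with a two-month lag) can be sketched as a small date calculation. This is only an illustration: the function name `latest_available_month` and the `lag_months` parameter are hypothetical and are not part of any MDV tool.

```python
from datetime import date


def latest_available_month(update_date: date, lag_months: int = 2) -> tuple[int, int]:
    """Return (year, month) of the newest data available after an update.

    Assumes the rule stated on this page: the latest data is from
    two months before the month-end update.
    """
    # Convert to a running month count so year boundaries roll over correctly.
    total_months = update_date.year * 12 + (update_date.month - 1) - lag_months
    return total_months // 12, total_months % 12 + 1


# Example from this page: the end-of-March 2021 update makes January 2021 available.
print(latest_available_month(date(2021, 3, 31)))  # (2021, 1)
```

Counting in months rather than subtracting days avoids errors caused by months of different lengths.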

About Health insurance data

The data is sourced from 223 health insurance societies across Japan (as of January 2025).
We collect insurance claim data for both inpatient and outpatient visits.
Currently, health checkup data is not available.
The hospital data and the health insurance data cannot be linked. However, data can be extracted from each database under exactly the same conditions, for example to analyze differences in trends.
Since clinic data makes up 80% of the database, we are particularly strong in areas where treatment is primarily provided at clinics.
Thanks to its high traceability, a patient's records can be followed even after a transfer to another hospital, without the patient being counted as a new patient. This is a major advantage when constructing treatment flows.
We believe it is suitable for research that does not require elderly patients, or for studies of treatment across clinics and hospitals.
A patient can be tracked as long as they remain affiliated with the same health insurance association.
The extrapolation function is also available for health insurance data.
Unfortunately, our database only includes medical information that is eligible for insurance reimbursement, as it is based on claim data.
