Having trouble processing ICD codes #5
Or does it mean ICD codes cannot be time-dependent variables? Surely they should be allowed? |
Hello, I have just updated the code to fix this error. Please download the latest code from GitHub. You may check out an example with data containing time-dependent ICD codes here. Please try to format your data according to this example. Additionally, if the FIDDLE/tests/icd_time_test/input/config-0.yaml Lines 4 to 5 in 86b197f
|
Many thanks @shengpu1126 ! Can I please confirm with you:
|
As currently I can only use
|
There's built-in support for ICD9/ICD10 codes through icd9cms and icd10-cm packages, I believe both
I am less familiar with DRG codes. Does the DRG code of
Otherwise you should preprocess it and include the separator:
|
Many thanks, @shengpu1126 ! I have separated ICD 9 and 10 codes from the rest, and named each coding scheme uniquely, e.g.:
I then got the error below, which was strange since '645.03' is a legitimate ICD9 code that indicates "Prolonged pregnancy, antepartum condition or complication".
I then removed the dots, as mentioned earlier, but the error stayed. However, changing
to
and switching back to codes that have the separator in them ( I have though now encountered a new error:
After some serious digging, I have traced the error back to line 223 in the
and in
Do you have any suggestions on how to deal with this situation, please? I am not sure what the 1s represent in |
Hi, the parser for ICD9/ICD10 relies on third-party packages that I do not control, so it is possible that the dictionary they use is outdated and missing some of the codes. In that case, I agree with what you did, which is to preprocess the codes by adding the separators. As for the issue of duplicates, the pipeline was not designed to handle them. This is because for most types of EHR data, such as vital signs, there should not be two different values for the same patient at one point in time. There are several things you could try that may help address the error you saw:
|
Many thanks @shengpu1126 ! Looking at the last example in my previous comment, can I please ask why you have different formats for
Or, e.g. reading the final
Are they different in terms of how one should interpret them? |
Also, what would |
This is likely because some ICD codes look like numbers, and Python will interpret them as numbers unless we explicitly tell it that they are strings. One workaround I usually use is to prepend an underscore ("123" -> "_123") so they cannot be interpreted as numbers. |
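The underscore workaround can be sketched in plain Python. This is a minimal illustration and not code from the FIDDLE pipeline: the helper names `looks_numeric` and `protect_code` are hypothetical, and the sample codes are just the ones mentioned in this thread.

```python
def looks_numeric(value: str) -> bool:
    """Return True if float() accepts the string, i.e. it could be read as a number."""
    try:
        float(value)
    except ValueError:
        return False
    return True


def protect_code(code, prefix: str = "_") -> str:
    """Prepend a prefix so the code can no longer be parsed as a number (hypothetical helper)."""
    return f"{prefix}{code}"


# Sample codes taken from this thread; some of them look like numbers.
codes = ["123", "0389", "V502", "645.03"]

# Which raw codes are at risk of being read as numbers?
at_risk = [c for c in codes if looks_numeric(c)]

# After prefixing, none of them can be parsed as a number any more.
protected = [protect_code(c) for c in codes]
still_at_risk = [p for p in protected if looks_numeric(p)]

print(at_risk)
print(protected)
print(still_at_risk)
```

The prefix has to be applied before the values reach any step that infers types, and the same prefix would need to be stripped again if the original codes are required later.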
I am not using MIMIC-III or eICU data, and since this pipeline should be applicable to other EHR data sets, I am using it for in-house EHR data. No matter how I preprocess ICD codes, e.g. ICD9:V50.2 vs V50.2 vs V502, I always encounter the error below:
So my df_types has only one ICD-related variable name, icd_code, which is correct. However, the parse_variable_data_type process has made a whole new list of variable names with icd at the beginning, which is why variables has a long list of "icd_code:*" elements. The whole process is very confusing and vague in details. Would you please enlighten me on the source of the error? Many thanks.