Possible error with regulatory_region subfeature #4794

Open
cmdcolin opened this issue Jan 25, 2025 · 4 comments
Labels
bug Something isn't working

Comments

@cmdcolin
Collaborator

Reported by a user on Gitter:

NC_007112.7	BestRefSeq	gene	24355021	24360249	.	+	.	ID=gene-rps3a;Dbxref=GeneID:337240,ZFIN:ZDB-GENE-030131-9184;Name=rps3a;description=ribosomal protein S3A;gbkey=Gene;gene=rps3a;gene_biotype=protein_coding;gene_synonym=fb02h01,rpS3Ae,wu:fb02h01,zgc:73195,zgc:86672
NC_007112.7	BestRefSeq	regulatory_region	24355021	24355520	.	+	.	ID=promoter-rna-NM_200059.2;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;Name=NM_200059.2;gbkey=promoter;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2;regulatory_class=promoter
NC_007112.7	BestRefSeq	mRNA	24355521	24360249	.	+	.	ID=rna-NM_200059.2;Parent=gene-rps3a;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;Name=NM_200059.2;gbkey=mRNA;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2
NC_007112.7	BestRefSeq	exon	24355521	24355607	.	+	.	ID=exon-NM_200059.2-1;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=mRNA;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2
NC_007112.7	BestRefSeq	CDS	24355546	24355607	.	+	0	ID=cds-NP_956353.1;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NP_956353.1,ZFIN:ZDB-GENE-030131-9184;Name=NP_956353.1;gbkey=CDS;gene=rps3a;product=40S ribosomal protein S3a;protein_id=NP_956353.1
NC_007112.7	BestRefSeq	regulatory_region	24355608	24356109	.	+	.	ID=intron-NM_200059.2-1;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=intron;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2;regulatory_class=intron
NC_007112.7	BestRefSeq	CDS	24356110	24356213	.	+	1	ID=cds-NP_956353.1;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NP_956353.1,ZFIN:ZDB-GENE-030131-9184;Name=NP_956353.1;gbkey=CDS;gene=rps3a;product=40S ribosomal protein S3a;protein_id=NP_956353.1
NC_007112.7	BestRefSeq	exon	24356110	24356213	.	+	.	ID=exon-NM_200059.2-2;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=mRNA;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2
NC_007112.7	BestRefSeq	regulatory_region	24356214	24356705	.	+	.	ID=intron-NM_200059.2-2;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=intron;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2;regulatory_class=intron
NC_007112.7	BestRefSeq	exon	24356706	24356893	.	+	.	ID=exon-NM_200059.2-3;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=mRNA;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2
NC_007112.7	BestRefSeq	CDS	24356706	24356893	.	+	2	ID=cds-NP_956353.1;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NP_956353.1,ZFIN:ZDB-GENE-030131-9184;Name=NP_956353.1;gbkey=CDS;gene=rps3a;product=40S ribosomal protein S3a;protein_id=NP_956353.1
NC_007112.7	BestRefSeq	regulatory_region	24356894	24357913	.	+	.	ID=intron-NM_200059.2-3;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=intron;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2;regulatory_class=intron
NC_007112.7	BestRefSeq	exon	24357914	24358122	.	+	.	ID=exon-NM_200059.2-4;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=mRNA;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2
NC_007112.7	BestRefSeq	CDS	24357914	24358122	.	+	0	ID=cds-NP_956353.1;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NP_956353.1,ZFIN:ZDB-GENE-030131-9184;Name=NP_956353.1;gbkey=CDS;gene=rps3a;product=40S ribosomal protein S3a;protein_id=NP_956353.1
NC_007112.7	BestRefSeq	regulatory_region	24358123	24358462	.	+	.	ID=intron-NM_200059.2-4;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=intron;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2;regulatory_class=intron
NC_007112.7	BestRefSeq	exon	24358463	24358572	.	+	.	ID=exon-NM_200059.2-5;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=mRNA;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2
NC_007112.7	BestRefSeq	CDS	24358463	24358572	.	+	1	ID=cds-NP_956353.1;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NP_956353.1,ZFIN:ZDB-GENE-030131-9184;Name=NP_956353.1;gbkey=CDS;gene=rps3a;product=40S ribosomal protein S3a;protein_id=NP_956353.1
NC_007112.7	BestRefSeq	regulatory_region	24358573	24359477	.	+	.	ID=intron-NM_200059.2-5;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=intron;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2;regulatory_class=intron
NC_007112.7	BestRefSeq	exon	24359478	24360249	.	+	.	ID=exon-NM_200059.2-6;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NM_200059.2,ZFIN:ZDB-GENE-030131-9184;gbkey=mRNA;gene=rps3a;product=ribosomal protein S3A;transcript_id=NM_200059.2
NC_007112.7	BestRefSeq	CDS	24359478	24359608	.	+	2	ID=cds-NP_956353.1;Parent=rna-NM_200059.2;Dbxref=GeneID:337240,Genbank:NP_956353.1,ZFIN:ZDB-GENE-030131-9184;Name=NP_956353.1;gbkey=CDS;gene=rps3a;product=40S ribosomal protein S3a;protein_id=NP_956353.1
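
One thing that may be worth checking in a custom GFF like the one above is whether every child feature lies within its parent's span. The following is a minimal Python sketch, not anything JBrowse itself runs; the function names `parse_attrs` and `out_of_parent` are hypothetical, and the sample lines are abridged copies of three records above (attributes trimmed to `ID` and `Parent`). Whether an out-of-span child relates to the display issue is an assumption, not something established in this thread.

```python
# Minimal sketch: flag GFF3 features that extend beyond their parent's span.
# Assumes tab-separated GFF3 columns: type = column 3, start/end = columns 4-5,
# attributes = column 9.

def parse_attrs(field):
    """Turn 'ID=x;Parent=y' into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def out_of_parent(gff_lines):
    """Return (type, id, start, end, parent, parent_start, parent_end) tuples
    for features whose coordinates fall outside a parent found in the input."""
    feats = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        feats.append((cols[2], int(cols[3]), int(cols[4]), parse_attrs(cols[8])))

    # Span of each feature by ID (parents here are single-line features).
    spans = {a["ID"]: (s, e) for _, s, e, a in feats if "ID" in a}

    problems = []
    for ftype, s, e, a in feats:
        for parent in a.get("Parent", "").split(","):
            if parent in spans:
                ps, pe = spans[parent]
                if s < ps or e > pe:
                    problems.append((ftype, a.get("ID"), s, e, parent, ps, pe))
    return problems


# Abridged copies of three records from the excerpt above.
sample = [
    "NC_007112.7\tBestRefSeq\tmRNA\t24355521\t24360249\t.\t+\t.\tID=rna-NM_200059.2;Parent=gene-rps3a",
    "NC_007112.7\tBestRefSeq\tregulatory_region\t24355021\t24355520\t.\t+\t.\tID=promoter-rna-NM_200059.2;Parent=rna-NM_200059.2",
    "NC_007112.7\tBestRefSeq\tregulatory_region\t24355608\t24356109\t.\t+\t.\tID=intron-NM_200059.2-1;Parent=rna-NM_200059.2",
]

for problem in out_of_parent(sample):
    print(problem)
```

On these three records the sketch reports only the promoter `regulatory_region`, which starts at 24355021 while its parent mRNA starts at 24355521.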

@cmdcolin added the bug (Something isn't working) label on Jan 25, 2025
@cmdcolin
Collaborator Author

Note that the NCBI GFF for danRer11 does not contain regulatory_region features (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000002035.6/), so this seems like it may come from a custom workflow by the Gitter user.

@jacobscgc

Hi Colin,

Indeed, the regulatory_region features come from a custom workflow I created. They represent promoter or intronic regions.
I tried setting the type to 'promoter' and 'intron', but that results in the exact same display issue. Specifically, trying to display the 'promoter' region causes a shift in where the features are displayed. Trying to display 'intron' does not seem to do much.

Maybe I am unaware of other types to be used in these cases.

Chris

@cmdcolin
Collaborator Author

@jacobscgc if there is a specific rendering you are trying to achieve, a mockup or screenshot would help me see what you mean.

I could try to comment on the above, but seeing a mockup would help.

@jacobscgc

Alright, I tried to make a mockup of what I am trying to achieve; hopefully it is reasonably clear.
Basically, the top track shows the default display of the gff, showing CDS and UTR where exons exceed the size of the CDS.
They are connected by black lines, which represent introns but are not defined explicitly.
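
To illustrate how those implied introns follow from the exon coordinates alone, here is a minimal Python sketch. The function name `implied_introns` is hypothetical, and the exon coordinates are copied from the GFF excerpt earlier in this issue; this is not how any particular viewer computes them, just the arithmetic for 1-based inclusive intervals.

```python
def implied_introns(exons):
    """Return 1-based inclusive intron intervals lying between sorted exons."""
    exons = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
        if next_start - prev_end > 1  # skip abutting or overlapping exons
    ]


# Exon coordinates of rna-NM_200059.2, copied from the excerpt in this issue.
exons = [
    (24355521, 24355607),
    (24356110, 24356213),
    (24356706, 24356893),
    (24357914, 24358122),
    (24358463, 24358572),
    (24359478, 24360249),
]

introns = implied_introns(exons)
for start, end in introns:
    print(start, end)
```

For this transcript the five derived intervals are the same as the five intron `regulatory_region` records in the custom GFF, for example 24355608 to 24356109 for the first intron.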

Of course in this custom GFF I do explicitly state where promoter/introns are by providing them as regulatory_region.
In the bottom track, I am only displaying this regulatory_region. Here, the black lines are basically the exons. And as can be seen, they do not line up. When displaying regulatory_region, there is a clear shift to the left.

Then, I drew a representation of what I am trying to achieve (with PowerPoint). In short, it is the same presentation as the default one, but with the regulatory_region drawn as a slightly thicker line that can be assigned a color. I made it orange here.

If I need to clarify please let me know.

[Image: mockup showing the default display, the regulatory_region-only track, and the desired rendering]
