Identify the uploads linked to no discernible Organization #14

Open
reefdog opened this issue Jul 18, 2019 · 4 comments
Labels
enhancement New feature or request

reefdog commented Jul 18, 2019

The /uploads directory of the assets.priorartarchive.org S3 bucket has directories (19 total) keyed by Organization.id.

Of these directories, 11 can be linked to organizations, but eight can't. I even checked them against the v1 database, and they don't exist there either.

| directory | v2.Organization.id | v1.Company.id | slug |
| --- | --- | --- | --- |
| 1b5186be-eb68-4129-9795-6983198760ac | 1b5186be-eb68-4129-9795-6983198760ac | | prod-test-account |
| 4b096527-aded-4ce6-9cf1-85a22c3d3ff5 | 4b096527-aded-4ce6-9cf1-85a22c3d3ff5 | ffca7fb0-e96c-4336-9687-303e1115abff | cisco |
| 70f25e84-388b-4f0c-8c46-18ac6d32aeaa | 70f25e84-388b-4f0c-8c46-18ac6d32aeaa | 2039bbf8-6112-4274-a7c3-4dd2c187a036 | msoftadmin |
| a4098829-461e-4903-b121-101d50af67af | a4098829-461e-4903-b121-101d50af67af | 397ecfaf-7c1c-4387-83da-81eaf9bfbbb3 | delladmin |
| e52c43ef-c4bd-4ff1-a8c2-a392dbc90a95 | e52c43ef-c4bd-4ff1-a8c2-a392dbc90a95 | | xeroxadmin |
| ae69945c-8ec2-4698-8766-6ef15abfb7be | ae69945c-8ec2-4698-8766-6ef15abfb7be | | magic-leap |
| 031242a6-e323-4f7d-b091-f0bb5fbd3ed2 | 031242a6-e323-4f7d-b091-f0bb5fbd3ed2 | | msj |
| f1aca22d-777e-4025-ae8d-b76159303310 | f1aca22d-777e-4025-ae8d-b76159303310 | | joelgustafson |
| f7dd5f57-2e03-4f9f-8848-64dddf7a9b9f | f7dd5f57-2e03-4f9f-8848-64dddf7a9b9f | | jinjoolee |
| a7c2498e-fdda-4767-ad10-c80505ab9fcf | a7c2498e-fdda-4767-ad10-c80505ab9fcf | | bjoshi |
| 5574d994-d9b9-4a65-90bd-f1d6be9b4c63 | 5574d994-d9b9-4a65-90bd-f1d6be9b4c63 | | leviton |
| 096e402e-b66d-460a-a503-8fc5bd9524f6 | | | |
| 4872e7bc-cea5-4e8d-abcf-a20f7905ed1b | | | |
| 685c745f-d0c0-4bc0-bf0d-02dc77d47674 | | | |
| b260f099-4698-4f2f-84bf-7637db9a5d0d | | | |
| b74bcfc5-5029-444d-bb3f-06597a056cfd | | | |
| d3118d8c-ae60-4a56-8781-486aa59a3f1d | | | |
| d32a7b3c-6310-42c8-be70-bb8796920cf8 | | | |
| e166c716-ad8f-40b9-9229-b7262cbc378b | | | |

We should figure out what these are. I've generated a complete recursive list of their contents. (Note that one directory actually contains three more directories, each with only one file.)
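For anyone wanting to repeat the comparison, here is a minimal sketch of the diffing step, not the script actually used. It assumes the object keys under `uploads/` have already been listed (for instance with an S3 client) and that the known `Organization.id` values have been exported to a set. The function names and the `example.pdf` file name are hypothetical.

```python
from typing import Iterable


def top_level_dirs(keys: Iterable[str], prefix: str = "uploads/") -> set[str]:
    """Collect the first path segment after `prefix` for every key."""
    dirs = set()
    for key in keys:
        if not key.startswith(prefix):
            continue  # ignore anything outside the uploads tree
        head, sep, _ = key[len(prefix):].partition("/")
        if sep and head:  # only count segments that actually contain something
            dirs.add(head)
    return dirs


def unlinked_dirs(keys: Iterable[str], known_org_ids: Iterable[str],
                  prefix: str = "uploads/") -> list[str]:
    """Directories under `prefix` whose name matches no known organization id."""
    return sorted(top_level_dirs(keys, prefix) - set(known_org_ids))


# Illustrative input only: two keys, one of which belongs to a known organization.
keys = [
    "uploads/4b096527-aded-4ce6-9cf1-85a22c3d3ff5/example.pdf",  # hypothetical file name
    "uploads/096e402e-b66d-460a-a503-8fc5bd9524f6/01549568245194.pdf",
]
known = {"4b096527-aded-4ce6-9cf1-85a22c3d3ff5"}
print(unlinked_dirs(keys, known))
```

Running the same set difference against the v1 `Company.id` values would cover the second check described above.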


reefdog commented Jul 19, 2019

@metasj asked for a more readable list of the problematic files, so here's a Gist with a table of all the files in the eight unlinked directories, along with their timestamps and byte sizes. Also here's an XLSX and zipped CSV, for good measure.

(Each file path is implicitly rooted at s3://assets.priorartarchive.org/uploads.)


metasj commented Jul 19, 2019

I don't see the directory names in the gist -- is that all files from all 8 directories, combined?
The directory seems like the most important piece for debugging.

Most are Cisco files, by inspection; a few are test uploads.


reefdog commented Jul 19, 2019

@metasj The directory names are built into the path name. E.g., 096e402e-b66d-460a-a503-8fc5bd9524f6/01549568245194.pdf is the file 01549568245194.pdf within the directory 096e402e-b66d-460a-a503-8fc5bd9524f6. You can also see the eight directories in the table above; they're the last eight rows, the ones with no corresponding v2.Organization.id or v1.Company.id.

(I'll go ahead and edit the Gist so the directory is its own column though, just for clarity!)
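As a rough illustration of how the directory can be pulled out of each path into its own column, here is a small sketch. It assumes paths are relative to `uploads`, as in the Gist; `split_upload_path` is a hypothetical helper name.

```python
def split_upload_path(path: str) -> tuple[str, str]:
    """Split 'directory/rest/of/path' into (directory, rest/of/path)."""
    directory, sep, rest = path.partition("/")
    if not sep or not directory or not rest:
        raise ValueError(f"expected '<directory>/<file>', got {path!r}")
    # `rest` may still contain slashes, since one directory holds nested directories.
    return directory, rest


directory, filename = split_upload_path(
    "096e402e-b66d-460a-a503-8fc5bd9524f6/01549568245194.pdf"
)
print(directory)
print(filename)
```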


metasj commented Jul 19, 2019

Got it, just hard to parse. We should have a username for every account. These are perhaps users who didn't set an organization.

24f6: me
ed1b: Joel?
7674: Travis?
5d0d: cisco file + title tests
6cfd:
3f1d:
0cf8: travis
378b: cisco test?
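The four-character labels above are the trailing characters of the eight unlinked directory names from the table in the first comment. A minimal sketch for expanding a label back to its full directory, assuming each label is unique within that list (`resolve_suffix` is a hypothetical helper name):

```python
UNLINKED_DIRS = [
    "096e402e-b66d-460a-a503-8fc5bd9524f6",
    "4872e7bc-cea5-4e8d-abcf-a20f7905ed1b",
    "685c745f-d0c0-4bc0-bf0d-02dc77d47674",
    "b260f099-4698-4f2f-84bf-7637db9a5d0d",
    "b74bcfc5-5029-444d-bb3f-06597a056cfd",
    "d3118d8c-ae60-4a56-8781-486aa59a3f1d",
    "d32a7b3c-6310-42c8-be70-bb8796920cf8",
    "e166c716-ad8f-40b9-9229-b7262cbc378b",
]


def resolve_suffix(suffix: str, directories: list[str] = UNLINKED_DIRS) -> str:
    """Return the single directory name ending in `suffix`."""
    matches = [d for d in directories if d.endswith(suffix)]
    if len(matches) != 1:
        # Refuse to guess when the label is missing or ambiguous.
        raise ValueError(f"{suffix!r} matched {len(matches)} directories")
    return matches[0]


print(resolve_suffix("24f6"))
```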

@reefdog reefdog added this to the Post-Migration Enhancements milestone Aug 14, 2019
@reefdog reefdog added the enhancement New feature or request label Aug 14, 2019
@reefdog reefdog removed this from the Post-Migration Enhancements milestone Aug 14, 2019