Comments (4)
When viewing the descMetadata datastream via the Fedora3 web admin:
Objectives:\r\n•
When looking at it from the console:
Objectives:\r\nâ\u0080¢
However, note that the contents appear the same when viewing their source and target from within FedoraMigrate::ObjectMover:
(byebug) source.datastreams["descMetadata"].content.split(/\n/)[5]
"<info:fedora/scholarsphere:7d279232g> <http://purl.org/dc/terms/description> \"Objectives:\\r\\nâ\u0080¢ Explain the role of a new genomic assay (Target Nowâ\u0084¢) in guiding oncology treatment plans.\\r\\nâ\u0080¢ Describe the Target Nowâ\u0084¢ assay.\\r\\nâ\u0080¢ Present a case study where Target Nowâ\u0084¢ was instrumental in the patientâ\u0080\u0099s treatment plan.\" ."
(byebug) mover.target.description.first
"Objectives:\r\nâ\u0080¢ Explain the role of a new genomic assay (Target Nowâ\u0084¢) in guiding oncology treatment plans.\r\nâ\u0080¢ Describe the Target Nowâ\u0084¢ assay.\r\nâ\u0080¢ Present a case study where Target Nowâ\u0084¢ was instrumental in the patientâ\u0080\u0099s treatment plan."
All of the above appears consistent, given that \u0080 is the escape listed under "Java Data" at:
http://www.fileformat.info/info/unicode/char/0080/index.htm
Note that U+0080 is a C1 control character rather than the Euro symbol, which is U+20AC.
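The code point can also be checked directly in Ruby. The sketch below (variable names are illustrative only) suggests that U+0080 is a C1 control character, and that the Euro sign is what Windows-1252 assigns to the single byte 0x80, whereas ISO-8859-1 maps that byte straight to U+0080:

```ruby
# U+0080 as a Ruby string: one code point, two bytes in UTF-8.
c1 = "\u0080"
c1_codepoint  = c1.ord    # 128
c1_utf8_bytes = c1.bytes  # [194, 128]

# A single raw byte 0x80, with no encoding assumed yet.
raw = [0x80].pack("C")

# Windows-1252 decodes byte 0x80 as the Euro sign (U+20AC).
euro_from_cp1252 = raw.dup.force_encoding(Encoding::Windows_1252).encode(Encoding::UTF_8)

# ISO-8859-1 decodes the same byte as the control character U+0080.
c1_from_latin1 = raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

puts euro_from_cp1252 == "\u20AC"
puts c1_from_latin1 == c1
```

The 194, 128 pair is the same byte pattern that shows up in the garbled strings.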
from fedora-migrate.
@mjgiarlo @jcoyne This has me stumped. The problem is the character data is coming out of Rubydora looking this way and FedoraMigrate appears to be faithfully replicating it... warts and all.
It looks like something may be forcing an ISO-8859-1 encoding onto the UTF-8 bytes and then encoding the result back to UTF-8. Here's the original string:
irb(main):036:0> original_string
=> "Objectives:\r\n• Explain the role of a new genomic assay (Target Now™) in guiding oncology treatment plans.\r\n• Describe the Target Now™ assay.\r\n• Present a case study where Target Now™ was instrumental in the patient’s treatment plan."
irb(main):037:0> puts original_string
Objectives:
• Explain the role of a new genomic assay (Target Now™) in guiding oncology treatment plans.
• Describe the Target Now™ assay.
• Present a case study where Target Now™ was instrumental in the patient’s treatment plan.
=> nil
irb(main):038:0> original_string.encoding
=> #<Encoding:UTF-8>
irb(main):039:0> original_string.bytes
=> [79, 98, 106, 101, 99, 116, 105, 118, 101, 115, 58, 13, 10, 226, 128, 162, 32, 32, 69, 120, 112, 108, 97, 105, 110, 32, 116, 104, 101, 32, 114, 111, 108, 101, 32, 111, 102, 32, 97, 32, 110, 101, 119, 32, 103, 101, 110, 111, 109, 105, 99, 32, 97, 115, 115, 97, 121, 32, 40, 84, 97, 114, 103, 101, 116, 32, 78, 111, 119, 226, 132, 162, 41, 32, 105, 110, 32, 103, 117, 105, 100, 105, 110, 103, 32, 111, 110, 99, 111, 108, 111, 103, 121, 32, 116, 114, 101, 97, 116, 109, 101, 110, 116, 32, 112, 108, 97, 110, 115, 46, 13, 10, 226, 128, 162, 32, 32, 68, 101, 115, 99, 114, 105, 98, 101, 32, 116, 104, 101, 32, 84, 97, 114, 103, 101, 116, 32, 78, 111, 119, 226, 132, 162, 32, 97, 115, 115, 97, 121, 46, 13, 10, 226, 128, 162, 32, 32, 80, 114, 101, 115, 101, 110, 116, 32, 97, 32, 99, 97, 115, 101, 32, 115, 116, 117, 100, 121, 32, 119, 104, 101, 114, 101, 32, 84, 97, 114, 103, 101, 116, 32, 78, 111, 119, 226, 132, 162, 32, 119, 97, 115, 32, 105, 110, 115, 116, 114, 117, 109, 101, 110, 116, 97, 108, 32, 105, 110, 32, 116, 104, 101, 32, 112, 97, 116, 105, 101, 110, 116, 226, 128, 153, 115, 32, 116, 114, 101, 97, 116, 109, 101, 110, 116, 32, 112, 108, 97, 110, 46]
And here's how things look when I force ISO-8859-1 and then encode as UTF-8:
irb(main):044:0> forced = original_string.force_encoding(Encoding::ISO8859_1).encode(Encoding::UTF_8)
=> "Objectives:\r\nâ\u0080¢ Explain the role of a new genomic assay (Target Nowâ\u0084¢) in guiding oncology treatment plans.\r\nâ\u0080¢ Describe the Target Nowâ\u0084¢ assay.\r\nâ\u0080¢ Present a case study where Target Nowâ\u0084¢ was instrumental in the patientâ\u0080\u0099s treatment plan."
irb(main):045:0> puts forced
Objectives:
� Explain the role of a new genomic assay (Target Now�) in guiding oncology treatment plans.
� Describe the Target Now� assay.
� Present a case study where Target Now� was instrumental in the patient�s treatment plan.
=> nil
irb(main):046:0> forced.encoding
=> #<Encoding:UTF-8>
irb(main):047:0> forced.bytes
=> [79, 98, 106, 101, 99, 116, 105, 118, 101, 115, 58, 13, 10, 195, 162, 194, 128, 194, 162, 32, 32, 69, 120, 112, 108, 97, 105, 110, 32, 116, 104, 101, 32, 114, 111, 108, 101, 32, 111, 102, 32, 97, 32, 110, 101, 119, 32, 103, 101, 110, 111, 109, 105, 99, 32, 97, 115, 115, 97, 121, 32, 40, 84, 97, 114, 103, 101, 116, 32, 78, 111, 119, 195, 162, 194, 132, 194, 162, 41, 32, 105, 110, 32, 103, 117, 105, 100, 105, 110, 103, 32, 111, 110, 99, 111, 108, 111, 103, 121, 32, 116, 114, 101, 97, 116, 109, 101, 110, 116, 32, 112, 108, 97, 110, 115, 46, 13, 10, 195, 162, 194, 128, 194, 162, 32, 32, 68, 101, 115, 99, 114, 105, 98, 101, 32, 116, 104, 101, 32, 84, 97, 114, 103, 101, 116, 32, 78, 111, 119, 195, 162, 194, 132, 194, 162, 32, 97, 115, 115, 97, 121, 46, 13, 10, 195, 162, 194, 128, 194, 162, 32, 32, 80, 114, 101, 115, 101, 110, 116, 32, 97, 32, 99, 97, 115, 101, 32, 115, 116, 117, 100, 121, 32, 119, 104, 101, 114, 101, 32, 84, 97, 114, 103, 101, 116, 32, 78, 111, 119, 195, 162, 194, 132, 194, 162, 32, 119, 97, 115, 32, 105, 110, 115, 116, 114, 117, 109, 101, 110, 116, 97, 108, 32, 105, 110, 32, 116, 104, 101, 32, 112, 97, 116, 105, 101, 110, 116, 195, 162, 194, 128, 194, 153, 115, 32, 116, 114, 101, 97, 116, 109, 101, 110, 116, 32, 112, 108, 97, 110, 46]
This is what we're doing here, right? https://github.com/projecthydra-labs/fedora-migrate/blob/master/lib/fedora_migrate/rdf_datastream_mover.rb#L33
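As a minimal sketch of the round trip shown in the irb session, and of how it might be reversed, assuming the data was garbled exactly once by this ISO-8859-1 reinterpretation. The shortened sample string and the variable names are illustrative, not taken from fedora-migrate:

```ruby
original = "Objectives:\r\n\u2022 Describe the Target Now\u2122 assay."

# Reinterpret the UTF-8 bytes as ISO-8859-1, then transcode to UTF-8.
# dup is used because force_encoding relabels the receiver in place.
garbled = original.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

# Reverse the damage: transcode back to ISO-8859-1 (one byte per character),
# then relabel those bytes as the UTF-8 they originally were.
repaired = garbled.encode(Encoding::ISO_8859_1).force_encoding(Encoding::UTF_8)

puts "\u2022".bytes.inspect             # [226, 128, 162]
puts garbled[13, 3].bytes.inspect       # [195, 162, 194, 128, 194, 162]
puts repaired == original               # true
```

The reversal only works on strings that really went through the round trip. Text containing characters outside ISO-8859-1 (for example an intact bullet) would raise Encoding::UndefinedConversionError in the encode step, so it should not be applied blindly.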
@mjgiarlo yes, this is what @jcoyne did to correct for objects that were throwing RDF errors. See #9
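If that re-encoding step is the cause, one possible way to find strings that have already been through it is a heuristic like the one below. This is only a sketch: likely_double_encoded? is a hypothetical helper, not part of fedora-migrate or Rubydora, and it can misfire on unusual Latin-1 text that happens to form valid UTF-8 byte sequences.

```ruby
# Hypothetical helper: guesses whether a UTF-8 string holds UTF-8 bytes that
# were misread as ISO-8859-1 and then transcoded to UTF-8 again.
def likely_double_encoded?(str)
  # Pure ASCII is identical in both encodings, so there is nothing to detect.
  return false if str.ascii_only?

  # Map each character back to a single ISO-8859-1 byte, then test whether
  # the resulting byte sequence is itself well-formed UTF-8.
  candidate = str.encode(Encoding::ISO_8859_1).force_encoding(Encoding::UTF_8)
  candidate.valid_encoding?
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  # Characters outside ISO-8859-1 (or broken input) mean this is not the
  # garbling pattern described in this thread.
  false
end

puts likely_double_encoded?("\u00E2\u0080\u00A2 Describe the assay.")  # true
puts likely_double_encoded?("\u2022 Describe the assay.")              # false
puts likely_double_encoded?("caf\u00E9")                               # false
```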