
Comments (12)

JonathanDCohen commented on May 18, 2024

Which character was the compiler complaining about?

from abseil-cpp.

gavxin commented on May 18, 2024

I found two places.

In unaligned_access.h, the two double quotes in this comment are Unicode characters:

// “ARMv7 or higher”, so we have to filter away all ARMv5 and ARMv6

In absl/strings/str_split_test.cc, the test case TEST(Split, UTF8) tests UTF-8 string splitting. It's just a test case, so maybe there is no need to change it:

TEST(Split, UTF8) {
  // Tests splitting utf8 strings and utf8 delimiters.
  {
    // A utf8 input std::string with an ascii delimiter.
    std::vector<absl::string_view> v = absl::StrSplit("a,κόσμε", ',');
    EXPECT_THAT(v, ElementsAre("a", "κόσμε"));
  }
  {
    // A utf8 input std::string and a utf8 delimiter.
    std::vector<absl::string_view> v = absl::StrSplit("a,κόσμε,b", ",κόσμε,");
    EXPECT_THAT(v, ElementsAre("a", "b"));
  }
  {
    // A utf8 input std::string and ByAnyChar with ascii chars.
    std::vector<absl::string_view> v =
        absl::StrSplit("Foo hällo th丞re", absl::ByAnyChar(" \t"));
    EXPECT_THAT(v, ElementsAre("Foo", "hällo", "th丞re"));
  }
}

After modifying these two files, the build succeeds.
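For anyone who wants to hunt for such characters without waiting for the compiler, here is a rough sketch of a scanner that reports every byte above 0x7F in a file's contents. `NonAsciiHit` and `FindNonAscii` are hypothetical names made up for this example, not part of Abseil, and columns are counted in bytes rather than characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Position of one non-ASCII byte, 1-based like compiler diagnostics.
struct NonAsciiHit {
  std::size_t line;
  std::size_t column;  // Byte offset within the line, not a character count.
};

// Hypothetical helper: returns the position of every byte above 0x7F
// found in `contents` (the full text of a source file).
inline std::vector<NonAsciiHit> FindNonAscii(const std::string& contents) {
  std::vector<NonAsciiHit> hits;
  std::size_t line = 1;
  std::size_t column = 1;
  for (unsigned char byte : contents) {
    if (byte == '\n') {
      ++line;
      column = 1;
      continue;
    }
    if (byte > 0x7F) hits.push_back({line, column});
    ++column;
  }
  return hits;
}
```

A multi-byte UTF-8 character shows up as several consecutive hits, one per byte, so a curly quote in a comment would be reported three times.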

JonathanDCohen commented on May 18, 2024

Since Abseil is fairly low-level, instead of forcing UTF-8 in MSVC builds, I think the correct move is to make our source files all ASCII. So I'll write up a patch to change that test to use Unicode escape sequences, which should make it agnostic to file encoding.

That being said, Abseil files are utf8 internally, so I'm not sure what other monsters there may be lurking if someone tries to use abseil source with other encodings.

gavxin commented on May 18, 2024

That makes sense. It's better to make the source files ASCII than to force UTF-8 for MSVC.

Thanks

manshreck commented on May 18, 2024

Note that the test is actually testing utf8 characters. I don't see how we can have a test for utf8 in ASCII ...

We should definitely prefer ASCII for source files, IMO, but I'm not sure requiring them to be ASCII is the best option.

mbxx commented on May 18, 2024

Jon and I chatted about this yesterday, and I came to the conclusion that Abseil files should be ASCII. Abseil strives to be as broadly usable as possible. Usually this takes the form of writing code that works on a wide variety of platforms, dodging bugs in those platforms if necessary. Here it means avoiding UTF-8 because some toolchains we want to support can't handle it.

The alternative isn't as bad as I feared: escaping the unicode characters in the test strings lets them be UTF-8 test strings that are specified using only ASCII characters in the source code. I'm a little sad, because suddenly it's no longer clear at a glance that the escaped characters are well-formed UTF-8, but this feels like the right thing to do to me.
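To make the escaping idea concrete, here is a minimal sketch. It is not the actual Abseil patch: it assumes the UTF-8 bytes are written as hex escapes, and `KosmeUtf8` and `SplitOn` are hypothetical names, with `SplitOn` standing in for absl::StrSplit so the snippet only needs the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The Greek word from the test, spelled as raw UTF-8 bytes so the source
// file stays pure ASCII. The code points U+03BA U+1F79 U+03C3 U+03BC U+03B5
// encode to the 11 bytes below. Each character is its own literal piece
// because a hex escape greedily absorbs any hex digits that follow it.
inline std::string KosmeUtf8() {
  return "\xce\xba" "\xe1\xbd\xb9" "\xcf\x83" "\xce\xbc" "\xce\xb5";
}

// Hypothetical stand-in for absl::StrSplit: splits `text` on every
// occurrence of the non-empty delimiter `delim`.
inline std::vector<std::string> SplitOn(const std::string& text,
                                        const std::string& delim) {
  assert(!delim.empty());
  std::vector<std::string> parts;
  std::size_t start = 0;
  for (;;) {
    const std::size_t pos = text.find(delim, start);
    if (pos == std::string::npos) {
      parts.push_back(text.substr(start));
      break;
    }
    parts.push_back(text.substr(start, pos - start));
    start = pos + delim.size();
  }
  return parts;
}
```

With this, `SplitOn("a," + KosmeUtf8() + ",b", "," + KosmeUtf8() + ",")` exercises the same UTF-8 delimiter case as the original test while the file itself contains only ASCII. The cost is the one mentioned above: the bytes are no longer readable at a glance.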

jorgbrown commented on May 18, 2024

The MSVC error message says "Save the file in Unicode format to prevent data loss"

Just curious: What does it mean to save the file in Unicode format? Does it just add a UTF-8 BOM to the beginning of the file? Or is there additional metadata in the filesystem to tell MSVC that the file is Unicode?

JonathanDCohen commented on May 18, 2024

"Unicode format" seems ambiguous. Anyways it looks like my internal patch to remove non-ascii characters has been upstreamed, and ASCII fits in all flavors of Unicode. Is this problem still persisting?

gavxin commented on May 18, 2024

I tried building 0ec11ba; str_split_test.cc still causes the following warnings, which are treated as errors:

ERROR: E:/xx/abseil-cpp/absl/strings/BUILD.bazel:214:1: C++ compilation of rule '//absl/strings:str_split_test' failed (Exit 2)
absl/strings/str_split_test.cc(624): error C2220: warning treated as error - no 'object' file generated
absl/strings/str_split_test.cc(624): warning C4566: character represented by universal-character-name '\u1F79' cannot be represented in the current code page (936)
absl/strings/str_split_test.cc(644): warning C4566: character represented by universal-character-name '\u00E4' cannot be represented in the current code page (936)
absl/strings/str_split_test.cc(645): warning C4566: character represented by universal-character-name '\u00E4' cannot be represented in the current code page (936)

It's just a test, so maybe I should ignore it.
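The C4566 warnings above are consistent with MSVC converting a \u escape in a narrow string literal to the execution character set, which here is code page 936 and cannot represent U+1F79 or U+00E4. One possible workaround, assuming hex byte escapes are acceptable in the test, is to work out the UTF-8 bytes ahead of time and write those instead. The sketch below shows the standard UTF-8 encoding rule; `EncodeUtf8` is a hypothetical helper for this example, not an Abseil function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes one Unicode scalar value as UTF-8 bytes.
inline std::string EncodeUtf8(char32_t cp) {
  // Surrogates and values past U+10FFFF are not valid scalar values.
  assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
  std::string out;
  if (cp < 0x80) {
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
  return out;
}
```

For the two characters named in the warnings, U+1F79 encodes to the bytes E1 BD B9 and U+00E4 to C3 A4, so the literals could be written as "\xe1\xbd\xb9" and "\xc3\xa4", which do not depend on the code page.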

ahedberg commented on May 18, 2024

I'm looking into fixing the str_split_test.cc failures.

ahedberg commented on May 18, 2024

This has been fixed internally; just waiting for the change to be propagated to GitHub.

mbxx commented on May 18, 2024

I just pushed the change to GitHub.
