Generic method for byte classification? #2033

Unanswered
QuarticCat asked this question in Q&A
In a talk, @lemire mentioned that in simdjson he used a technique to classify bytes:

  1. Construct two tables of 16 entries each, mapping a nibble (4 bits) to a byte.
  2. Split each input byte into its upper nibble and its lower nibble.
  3. Look up each nibble in its own table, which yields two bytes.
  4. Combine these two bytes with a bitwise AND or OR to get the final class of the original byte.
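
The four steps above can be sketched in scalar C++ as follows. This is only an illustration, not simdjson's actual tables or code: the class bits `DIGIT`, `SPACE`, and `CTRL_WS`, the table contents, and the `classify` function are all assumptions chosen for this example. In a SIMD implementation, each table lookup would typically be a byte-shuffle instruction (such as SSSE3 `pshufb`) applied to a whole register of bytes at once.

```cpp
#include <cstdint>

// Hypothetical class bits, chosen for this illustration only.
constexpr std::uint8_t DIGIT   = 0x01;  // '0'..'9'         = 0x30..0x39
constexpr std::uint8_t SPACE   = 0x02;  // ' '              = 0x20
constexpr std::uint8_t CTRL_WS = 0x04;  // '\t', '\n', '\r' = 0x09, 0x0A, 0x0D

// Step 1a: table indexed by the LOWER nibble. An entry holds every class
// bit that is still possible when the byte ends in that nibble.
constexpr std::uint8_t lo_table[16] = {
    DIGIT | SPACE,                                           // 0x0: '0', ' '
    DIGIT, DIGIT, DIGIT, DIGIT, DIGIT, DIGIT, DIGIT, DIGIT,  // 0x1..0x8
    DIGIT | CTRL_WS,                                         // 0x9: '9', '\t'
    CTRL_WS,                                                 // 0xA: '\n'
    0, 0,                                                    // 0xB, 0xC
    CTRL_WS,                                                 // 0xD: '\r'
    0, 0                                                     // 0xE, 0xF
};

// Step 1b: table indexed by the UPPER nibble.
constexpr std::uint8_t hi_table[16] = {
    CTRL_WS,  // 0x0_: control characters
    0,        // 0x1_
    SPACE,    // 0x2_: ' '
    DIGIT,    // 0x3_: '0'..'9'
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0  // 0x4_..0xF_
};

// Steps 2-4: split the byte, look up both nibbles, AND the results.
// A class bit survives only if BOTH nibbles are compatible with it.
constexpr std::uint8_t classify(std::uint8_t b) {
    return static_cast<std::uint8_t>(lo_table[b & 0x0F] & hi_table[b >> 4]);
}
```

With these tables, `classify('7')` yields `DIGIT`, `classify(' ')` yields `SPACE`, `classify('\n')` yields `CTRL_WS`, and `classify(')')` yields `0`. Whitespace is split into two bits here because a single bit under the AND combination can only describe a set of the form (set of upper nibbles) × (set of lower nibbles). One bit shared by 0x20 and 0x09 would also match 0x29, which is `')'`.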

My question: is there a generic algorithm for constructing such tables, and for identifying the cases where this technique does not apply?

Replies: 1 comment

We call this vectorized classification. To my knowledge, it has not been formalized yet.

I will work on it.
