What's the fastest way to check if a string is in a static compile-time set?
I know that hash codes are generally the fastest way to check dynamic sets, but I was wondering what is the fastest way to check whether a dynamic string is in a read-only string set known at compile-time. (I mean mainly length: usize; chars: &[u8]
strings, not ropes or cons strings.)
Currently, I'm usually doing stuff like this, but it seems like it'd be suboptimal:
// What I mean
let keywords = Set::new(["do", "if", "in", "for", "new", "try"]);
fun is_keyword(s: &str) keywords.contains(s)
// What I write
function is_keyword(s: &str)
match s.length()
2 -> s == "do"
Is there anything faster than something derived from this second variant for sets of C-style strings? Or is it as fast as I could reasonably get?
This is language-agnostic - I don't care what languages answers use. I'm just using Rust due to familiarity.
string algorithm language-agnostic
add a comment |
I know that hash codes are generally the fastest way to check dynamic sets, but I was wondering what is the fastest way to check whether a dynamic string is in a read-only string set known at compile-time. (I mean mainly length: usize; chars: &[u8]
strings, not ropes or cons strings.)
Currently, I'm usually doing stuff like this, but it seems like it'd be suboptimal:
// What I mean
let keywords = Set::new(["do", "if", "in", "for", "new", "try"]);
fun is_keyword(s: &str) keywords.contains(s)
// What I write
function is_keyword(s: &str)
match s.length()
2 -> s == "do"
Is there anything faster than something derived from this second variant for sets of C-style strings? Or is it as fast as I could reasonably get?
This is language-agnostic - I don't care what languages answers use. I'm just using Rust due to familiarity.
string algorithm language-agnostic
The fastest is a compile time generated trie, which gives O(L) time, whereL
is the length of the string you're looking for.
– user3386109
Nov 10 at 1:40
add a comment |
I know that hash codes are generally the fastest way to check dynamic sets, but I was wondering what is the fastest way to check whether a dynamic string is in a read-only string set known at compile-time. (I mean mainly length: usize; chars: &[u8]
strings, not ropes or cons strings.)
Currently, I'm usually doing stuff like this, but it seems like it'd be suboptimal:
// What I mean
let keywords = Set::new(["do", "if", "in", "for", "new", "try"]);
fun is_keyword(s: &str) keywords.contains(s)
// What I write
function is_keyword(s: &str)
match s.length()
2 -> s == "do"
Is there anything faster than something derived from this second variant for sets of C-style strings? Or is it as fast as I could reasonably get?
This is language-agnostic - I don't care what languages answers use. I'm just using Rust due to familiarity.
string algorithm language-agnostic
I know that hash codes are generally the fastest way to check dynamic sets, but I was wondering what is the fastest way to check whether a dynamic string is in a read-only string set known at compile-time. (I mean mainly length: usize; chars: &[u8]
strings, not ropes or cons strings.)
Currently, I'm usually doing stuff like this, but it seems like it'd be suboptimal:
// What I mean
let keywords = Set::new(["do", "if", "in", "for", "new", "try"]);
fun is_keyword(s: &str) keywords.contains(s)
// What I write
function is_keyword(s: &str)
match s.length()
2 -> s == "do"
Is there anything faster than something derived from this second variant for sets of C-style strings? Or is it as fast as I could reasonably get?
This is language-agnostic - I don't care what languages answers use. I'm just using Rust due to familiarity.
string algorithm language-agnostic
string algorithm language-agnostic
asked Nov 10 at 0:56
Isiah Meadows
1,0431324
1,0431324
The fastest is a compile time generated trie, which gives O(L) time, whereL
is the length of the string you're looking for.
– user3386109
Nov 10 at 1:40
add a comment |
The fastest is a compile time generated trie, which gives O(L) time, whereL
is the length of the string you're looking for.
– user3386109
Nov 10 at 1:40
The fastest is a compile time generated trie, which gives O(L) time, where
L
is the length of the string you're looking for.– user3386109
Nov 10 at 1:40
The fastest is a compile time generated trie, which gives O(L) time, where
L
is the length of the string you're looking for.– user3386109
Nov 10 at 1:40
add a comment |
2 Answers
2
active
oldest
votes
For a static set, you can use perfect hashing. This is essentially a hash table, but the hash function guarantees that every string in the set hashes to a unique index in the table.
To test a dynamic string, you just hash it to an index using the perfect hash function, and then see if the one and only string at that index matches the test string.
A google search will find you lots of different ways to do perfect hashing. One of my favorites is described here: http://cmph.sourceforge.net/papers/chm92.pdf
It's often used for keyword matching in compilers, or implementing switch/case on strings in languages that support that.
add a comment |
Like you said, seems like the fastest way would be to hash the strings. Your current way would take O(N) time to search for the largest string in the set, or for a string that isn't in the set at all.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53235091%2fwhats-the-fastest-way-to-check-if-a-string-is-in-a-static-compile-time-set%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
For a static set, you can use perfect hashing. This is essentially a hash table, but the hash function guarantees that every string in the set hashes to a unique index in the table.
To test a dynamic string, you just hash it to an index using the perfect hash function, and then see if the one and only string at that index matches the test string.
A google search will find you lots of different ways to do perfect hashing. One of my favorites is described here: http://cmph.sourceforge.net/papers/chm92.pdf
It's often used for keyword matching in compilers, or implementing switch/case on strings in languages that support that.
add a comment |
For a static set, you can use perfect hashing. This is essentially a hash table, but the hash function guarantees that every string in the set hashes to a unique index in the table.
To test a dynamic string, you just hash it to an index using the perfect hash function, and then see if the one and only string at that index matches the test string.
A google search will find you lots of different ways to do perfect hashing. One of my favorites is described here: http://cmph.sourceforge.net/papers/chm92.pdf
It's often used for keyword matching in compilers, or implementing switch/case on strings in languages that support that.
add a comment |
For a static set, you can use perfect hashing. This is essentially a hash table, but the hash function guarantees that every string in the set hashes to a unique index in the table.
To test a dynamic string, you just hash it to an index using the perfect hash function, and then see if the one and only string at that index matches the test string.
A google search will find you lots of different ways to do perfect hashing. One of my favorites is described here: http://cmph.sourceforge.net/papers/chm92.pdf
It's often used for keyword matching in compilers, or implementing switch/case on strings in languages that support that.
For a static set, you can use perfect hashing. This is essentially a hash table, but the hash function guarantees that every string in the set hashes to a unique index in the table.
To test a dynamic string, you just hash it to an index using the perfect hash function, and then see if the one and only string at that index matches the test string.
A google search will find you lots of different ways to do perfect hashing. One of my favorites is described here: http://cmph.sourceforge.net/papers/chm92.pdf
It's often used for keyword matching in compilers, or implementing switch/case on strings in languages that support that.
edited Nov 10 at 2:43
answered Nov 10 at 2:29
Matt Timmermans
18.6k11532
18.6k11532
add a comment |
add a comment |
Like you said, seems like the fastest way would be to hash the strings. Your current way would take O(N) time to search for the largest string in the set, or for a string that isn't in the set at all.
add a comment |
Like you said, seems like the fastest way would be to hash the strings. Your current way would take O(N) time to search for the largest string in the set, or for a string that isn't in the set at all.
add a comment |
Like you said, seems like the fastest way would be to hash the strings. Your current way would take O(N) time to search for the largest string in the set, or for a string that isn't in the set at all.
Like you said, seems like the fastest way would be to hash the strings. Your current way would take O(N) time to search for the largest string in the set, or for a string that isn't in the set at all.
answered Nov 10 at 1:01
jahneff
433
433
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53235091%2fwhats-the-fastest-way-to-check-if-a-string-is-in-a-static-compile-time-set%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
The fastest is a compile time generated trie, which gives O(L) time, where
L
is the length of the string you're looking for.– user3386109
Nov 10 at 1:40