Conditional Matches aka Checking If A String Contains Another String In Python
-- comment
-- p
References have been moved up; the post itself is
assembled further down.
-- p
These are the basic ``in`` and ``re.search()`` patterns to
use most of the time. There are also ``.find()`` and
``.index()``, which are covered further down.
-- p
Basic find via the ``in`` operator:
-- python
sentence = "the quick brown fox"
word = "quick"
if word in sentence:
    print("hit")
else:
    print("miss")
-- results/
hit
-- /results
-- p
Basic regular expression match:
-- python
import re
sentence = "the quick 9000 brown fox"
expression = r"quick \d\d\d\d"
if re.search(expression, sentence):
    print("hit")
else:
    print("miss")
-- results/
hit
-- /results
-- p
#+POST
-- p
There are a few different ways to see if one Python
string contains another.
-- p
The short version is:
-- p
- Use the ``in`` operator if you don't need
to match a regular expression
- Use ``re.search()`` if you do
-- p
It's also possible to use ``.find()`` and ``.index()``.
Those tell you _where_ the search string sits inside
the larger one, if you need that.
-- p
Here's how the different methods look:
-- p
** Using The ``in`` Operator
-- p
The ``in`` operator works like this:
-- p
<<<src_in>>>
-- p
** Using ``re.search()`` With A String
-- p
This method uses the ``re`` module for regular expressions.
You can search for basic text like this:
-- python
import re
sentence = "the quick brown fox"
word = "quick"
if re.search(word, sentence):
    print("hit")
else:
    print("miss")
-- results/
hit
-- /results
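-- p
One caveat with passing plain text to ``re.search()``: the text is
still read as a pattern, so characters like ``$`` and ``.`` take on
their regular expression meaning. The sketch below uses a made-up
sentence and search string (neither is from the examples above) to
show one way of handling that with ``re.escape()``.

```python
import re

# Hypothetical sample text chosen to contain regex metacharacters.
sentence = "the price is $5.00 (approx.)"
word = "$5.00"

# As a pattern, "$" means end-of-string and "." means any character,
# so the unescaped search does not find the literal text.
unescaped_hit = re.search(word, sentence) is not None

# re.escape() backslash-escapes the metacharacters first.
escaped_hit = re.search(re.escape(word), sentence) is not None

# The in operator never treats the text as a pattern.
in_hit = word in sentence

print(unescaped_hit)  # False
print(escaped_hit)    # True
print(in_hit)         # True
```

If the search text is always literal, the ``in`` operator sidesteps
the problem entirely.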
-- p
** Using ``re.search()`` With A Regular Expression
-- p
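The ``re.search()`` function can be used with full
regular expressions as well. The ``r`` prefix on the pattern
makes it a raw string, so Python leaves backslashes like the
one in ``\d`` alone and passes them through to the regular
expression engine.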
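-- p
To make the ``r`` prefix concrete, here is a minimal sketch comparing
raw and regular strings. The variable names and the ``\b`` word
boundary pattern are illustrative choices, not part of the examples
above.

```python
import re

sentence = "the quick 9000 brown fox"

# Raw string: the backslash in \d reaches the regex engine untouched.
raw_pattern = r"quick \d\d\d\d"

# Regular string: each backslash has to be doubled to survive
# Python's own string escaping.
escaped_pattern = "quick \\d\\d\\d\\d"

# Both spellings produce the same pattern text.
same_pattern = raw_pattern == escaped_pattern

# Where the prefix really matters: "\b" in a regular string is a
# backspace character, while r"\b" is the regex word boundary.
found_with_plain = re.search("\bfox\b", sentence) is not None
found_with_raw = re.search(r"\bfox\b", sentence) is not None

print(same_pattern)      # True
print(found_with_plain)  # False
print(found_with_raw)    # True
```

Using the ``r`` prefix on every pattern is a simple habit that avoids
having to remember which escapes Python would otherwise interpret.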
-- p
This example looks for the word "quick" followed
by a space and then four digits.
-- p
<<<src_regex>>>
-- p
** Using ``.find()``
-- p
``.find()`` is a little different in that it returns the index
of the first match if it finds one. Because a match at the very
start of the string has an index of zero, ``.find()`` signals a
miss with ``-1`` instead. That requires checking for the value
explicitly, which looks like this:
-- python
sentence = "the quick brown fox"
word = "quick"
result = sentence.find(word)
if result != -1:
    print(f"hit at index: {result}")
else:
    print("miss")
-- results/
hit at index: 4
-- /results
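-- p
The reason for the explicit ``-1`` comparison shows up when the result
is used directly as a condition. This is a small sketch of that
pitfall; the search words ``"the"`` and ``"cat"`` are picked for
illustration.

```python
sentence = "the quick brown fox"

# "the" starts at index 0, which is falsy.
at_start = sentence.find("the")

# "cat" is missing, so the result is -1, which is truthy.
missing = sentence.find("cat")

# Incorrect: relying on truthiness gets both cases backwards.
naive_at_start = bool(at_start)   # False, even though "the" is there
naive_missing = bool(missing)     # True, even though "cat" is not

# Correct: compare against -1 explicitly.
checked_at_start = at_start != -1  # True
checked_missing = missing != -1    # False

print(at_start, naive_at_start, checked_at_start)  # 0 False True
print(missing, naive_missing, checked_missing)     # -1 True False
```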
-- p
** Using ``.index()``
-- p
Using ``.index()`` is like using ``.find()`` in that it
returns the position of the match. The difference is
that it raises a ``ValueError`` if no match is found. Dealing
with that requires a try/except/else block:
-- python
sentence = "the quick brown fox"
word = "quick"
try:
    result = sentence.index(word)
except ValueError:
    print("miss")
else:
    print(f"hit at index: {result}")
-- results/
hit at index: 4
-- /results
-- p
** Notes
-- p
- The ``re`` module also has a ``.match()`` function that works
like ``.search()`` except that it only matches from the
start of the string. I never use it. If I need to
match from the start of the string, I use the ``^``
character in the regular expression (e.g. ``r"^first words"``).
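-- p
A quick side-by-side of the two approaches from the note above. The
sample sentence is made up for the comparison.

```python
import re

# Hypothetical sample text for the comparison.
sentence = "first words of the sentence"

# re.match() only looks at the start of the string.
match_at_start = re.match(r"first words", sentence) is not None
match_later = re.match(r"sentence", sentence) is not None

# re.search() with ^ anchors the pattern to the start as well.
search_at_start = re.search(r"^first words", sentence) is not None
search_later = re.search(r"^sentence", sentence) is not None

# Without the anchor, re.search() finds text anywhere in the string.
search_anywhere = re.search(r"sentence", sentence) is not None

print(match_at_start, match_later)      # True False
print(search_at_start, search_later)    # True False
print(search_anywhere)                  # True
```

The two behave the same here. They diverge under ``re.MULTILINE``,
where ``^`` also matches at the start of each line while
``re.match()`` still only checks the start of the string.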
-- end of line --