It's been a long time since I wrote my last how2 post, but this interesting little question was brought up to me recently and I figured why not throw it up here. How would you programmatically search a string to see if it contains another string? Of course I'm doing this in C#, but the logic would be applicable to any language.
[NOTE: If you just want the quick way to do this then scroll to the bottom.]
Here's a visual of what I mean... We're looking to see if string1 (lookingFor) is contained anywhere within string2 (lookingIn). For example:
lookingFor = abcd
lookingIn = abcd OR efgh OR eabcdf OR eabfcd OR abceabfabcdghi
You can see that of all the options for lookingIn, lookingFor is found in #1, #3 and #5. The options vary in how difficult they are to traverse, though.
1. abcd : Easy enough, the strings are exactly the same.
2. efgh : Easy enough, the string is nowhere to be found.
3. eabcdf : Now lookingFor is present, but it's surrounded by other letters.
4. eabfcd : A little more complicated...the individual letters from lookingFor are present, but the exact string is not, as there is another character between b and c.
5. abceabfabcdghi : Ok, now we're getting really complicated. At first we get all the way to the 3rd character from lookingFor [c] before being faced with an incorrect character. Then we start over again and get to the 2nd character. Finally, we find the full string towards the end of lookingIn.
As you can see, the cases quickly go from simple to complicated. We have to handle multiple failures while traversing the string, lookingIn, and make sure that we're finding the whole of the string, lookingFor, not just bits and pieces.
Let's see if I can explain the logic behind the code below. First of all, we iterate through the string we are searching (lookingIn), trying to match the first letter of the string we are looking for (lookingFor). When a character matches, we increase a counter [integer j below] that tracks how many characters of lookingFor we have found so far. We then check whether the counter equals the number of characters in lookingFor [the .Length property]. If so, we have found the full string and can stop searching. If not, we continue on to the next character in lookingIn and try to match it to the next character in lookingFor [the one at position j].
Here's the kicker: if at any point the current character from lookingIn does not equal the character at position j in lookingFor, we reset the counter to 0 so that we start over. This is what allows case #5 above to work. It increases the counter by 1 for a, b and c, then resets it to 0 when it reaches e.
Here's the full code for a console app [so you can copy/paste into your compiler and play with it]:
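using System;

namespace StringInString
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length < 2)
            {
                Console.WriteLine("Usage: StringInString <lookingFor> <lookingIn>");
                return;
            }
            // Lower-case both strings so the search ignores case.
            bool isContained = FindString(args[0].ToLower(), args[1].ToLower());
            Console.WriteLine(isContained);
        }

        static bool FindString(string lookingFor, string lookingIn)
        {
            // An empty string counts as contained in any string.
            if (lookingFor.Length == 0)
                return true;

            int j = 0; // how many characters of lookingFor have matched so far
            for (int i = 0; i < lookingIn.Length; i++)
            {
                if (lookingIn[i] == lookingFor[j])
                {
                    j++;
                    if (j == lookingFor.Length)
                        return true; // the whole of lookingFor was found
                }
                else
                {
                    // Go back to where the partial match began; the loop's i++
                    // then resumes at the character right after that start,
                    // so a match overlapping the failed attempt is not skipped.
                    i -= j;
                    j = 0;
                }
            }
            // A partial match at the very end of lookingIn does not count.
            return false;
        }
    }
}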
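One more way to look at the same search, as a minimal sketch rather than a replacement for the code above: instead of carrying a single counter through lookingIn, try every possible starting position and compare forward from there. It's the plain brute-force approach, slower on long inputs but easy to reason about, so it's handy for double-checking results. The class name BruteForceSearch and the method name Find are made up for this example.

```csharp
using System;

static class BruteForceSearch
{
    // Hypothetical helper: true if lookingFor occurs anywhere in lookingIn.
    public static bool Find(string lookingFor, string lookingIn)
    {
        // Try every start position where a full match could still fit.
        for (int start = 0; start + lookingFor.Length <= lookingIn.Length; start++)
        {
            int j = 0;
            // Compare forward from 'start' until a mismatch or the end of lookingFor.
            while (j < lookingFor.Length && lookingIn[start + j] == lookingFor[j])
                j++;

            if (j == lookingFor.Length)
                return true; // every character of lookingFor matched
        }
        return false;
    }

    // Runs the five example inputs from the top of the post.
    static void Main()
    {
        string[] candidates = { "abcd", "efgh", "eabcdf", "eabfcd", "abceabfabcdghi" };
        foreach (string lookingIn in candidates)
            Console.WriteLine(lookingIn + " : " + Find("abcd", lookingIn));
    }
}
```

Run against the five examples, it should report a match for #1, #3 and #5 only.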
Now, having gone through that exercise in logic, here are two quick ways built in to .NET:
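1. Regular Expressions. Use the Regex.IsMatch method to determine if lookingFor is contained in lookingIn. This method returns a boolean value. Keep in mind that the second argument is treated as a regular expression pattern, so wrap lookingFor in Regex.Escape if it should be matched literally. It would look like this [remember to add the using statement for System.Text.RegularExpressions to the top of your file]:
bool isContained = Regex.IsMatch(lookingIn, Regex.Escape(lookingFor));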
2. Strings have a method called IndexOf() that will search lookingIn for lookingFor and return the index where lookingFor starts, or -1 if it isn't found.
int indexFound = lookingIn.IndexOf(lookingFor);
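To make the two built-in options concrete, here is a small sketch that wraps each one in a helper. The class name BuiltInSearch and the Via... method names are made up for illustration; the calls inside them (Regex.IsMatch, Regex.Escape, String.IndexOf with a StringComparison, and String.Contains) are the standard .NET ones. Passing StringComparison.Ordinal is my own choice here, since IndexOf(string) on its own compares using the current culture, and OrdinalIgnoreCase stands in for the ToLower() calls used earlier.

```csharp
using System;
using System.Text.RegularExpressions;

static class BuiltInSearch
{
    // Literal search through Regex: Escape turns characters like '.' into plain text.
    public static bool ViaRegex(string lookingFor, string lookingIn)
    {
        return Regex.IsMatch(lookingIn, Regex.Escape(lookingFor));
    }

    // Ordinal search: compares the characters themselves, no culture rules.
    public static bool ViaIndexOf(string lookingFor, string lookingIn)
    {
        return lookingIn.IndexOf(lookingFor, StringComparison.Ordinal) >= 0;
    }

    // Case-insensitive variant, so neither string needs ToLower().
    public static bool ViaIndexOfIgnoreCase(string lookingFor, string lookingIn)
    {
        return lookingIn.IndexOf(lookingFor, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    static void Main()
    {
        // Unescaped, '.' matches any character, so "a.c" is "found" in "abc".
        Console.WriteLine(Regex.IsMatch("abc", "a.c"));            // True
        Console.WriteLine(ViaRegex("a.c", "abc"));                 // False
        Console.WriteLine(ViaIndexOf("abcd", "eabcdf"));           // True
        Console.WriteLine(ViaIndexOfIgnoreCase("ABCD", "eabcdf")); // True
        // If you only need yes/no, String.Contains skips the index entirely.
        Console.WriteLine("eabcdf".Contains("abcd"));              // True
    }
}
```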
Given the choice, I'll pick either of the methods built into the .NET framework every single day of the week! Why reinvent the wheel, especially when theirs is probably optimized for speed and efficiency and already deals with the edge cases? But it's still interesting to work through the logic and see how the functions work.
~tod