{"id":613,"date":"2010-01-07T10:29:53","date_gmt":"2010-01-07T00:29:53","guid":{"rendered":"http:\/\/www.malcolmgroves.com\/blog\/?p=613"},"modified":"2015-03-13T11:02:56","modified_gmt":"2015-03-13T00:02:56","slug":"searching-in-delphi-part-1-regular-expressions","status":"publish","type":"post","link":"http:\/\/www.malcolmgroves.com\/blog\/?p=613","title":{"rendered":"Searching in Delphi Part 1 : Regular Expressions"},"content":{"rendered":"<p>Being able to find elements in your code quickly and easily is critical to being productive in any IDE. Spend too long looking for things and you start to lose your train of thought. Over the years Delphi has introduced lots of different ways to search your code, some of them simple text-based matching, some of them much more capable search engines that actually understand the structure of your code. However, I regularly meet developers who aren\u2019t aware of many of them, beyond doing a simple search using the <a href=\"http:\/\/docwiki.embarcadero.com\/RADStudio\/en\/Find\" target=\"_blank\">Search | Find<\/a> (Ctrl-F) menu option, or the same across multiple files using <a href=\"http:\/\/docwiki.embarcadero.com\/RADStudio\/en\/Find_in_Files\" target=\"_blank\">Search | Find in Files<\/a> (Shift-Ctrl-F).<\/p>\n<p><!--more-->Starting with this post I\u2019m going to try and address some of that. Each post will focus on a different way to find things in your project. I\u2019m going to assume, however, that everyone can use the basic Find and Find in Files functionality, so I won\u2019t cover that. However, I will start by covering one feature of both of those text-based searches that seems to be underused : Regular Expressions.<\/p>\n<p>When doing a text-based search, usually the more specific you can be with your search string, the fewer false matches (ie. matches that you are not actually looking for) you\u2019ll get. 
The problem is you often either don\u2019t know enough of the exact text around your match, or you\u2019re trying to match multiple lines with different text around your string. So you end up with a not very specific search string, and so many matches you waste a bunch of time trying to find the relevant ones, or worse, miss some relevant ones amongst the deluge.<\/p>\n<p>You\u2019re probably never going to get the perfect result of just the items you\u2019re looking for. It ends up being a trade-off between the time spent creating the search string vs the time spent wading through the results, and often a good enough balance is, well, good enough. This is where regular expressions can help.<\/p>\n<p>Let me give you a concrete example.<\/p>\n<p>The DUnit framework makes heavy use of an interface called ITest. Let\u2019s say you want to find all places in TestFramework.pas where an ITest is passed in as a parameter to a method.<\/p>\n<p><a href=\"http:\/\/www.malcolmgroves.com\/blog\/wp-content\/uploads\/2010\/01\/search1.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"margin: 0px 10px 0px 0px; display: inline; border-width: 0px;\" title=\"search1\" src=\"http:\/\/www.malcolmgroves.com\/blog\/wp-content\/uploads\/2010\/01\/search1-thumb.jpg\" alt=\"search1\" width=\"585\" height=\"401\" align=\"left\" border=\"0\" \/><\/a> Well, we can start by hitting Ctrl-F, typing in ITest and hitting Enter, and as you can see in the screenshot to the left, we get 143 results for just this one unit.<\/p>\n<p>As you should also be able to see, there are a whole bunch of results that aren\u2019t actually what we\u2019re looking for. It\u2019s matching ITestDecorator, ITestListener, even comments. Further, even where it has found ITest, it isn\u2019t just showing us where it\u2019s been used as a method parameter; it\u2019s showing us everything.<\/p>\n<p>So, how can we try to narrow down our search results? 
Well, we could try searching for \u201c: ITest\u201d and this gets us fewer results, but it still has problems. First, it misses places where there is no space between the colon and ITest. It also has false positives on field and variable declarations of type ITest. We could search on \u201c: ITest)\u201d which removes the field and variable declarations, but we\u2019ve now lost the instances where there are additional parameters in the method signature after the ITest parameter. Just to complicate matters further, what if the parameter is an array of ITest? We\u2019ve missed those too.<\/p>\n<p>OK, now that my strawman is suitably set up, think about how we could more accurately identify just the ITest method parameters. Perhaps if we look for a colon followed by zero or more characters, followed by ITest, followed by zero or more characters, followed by a closing bracket.<\/p>\n<p>I\u2019m a long, long way from being a regex expert: I can fairly easily remember 3 or 4 syntax elements without looking at a reference, but beyond that I\u2019m struggling. However, even these few can be pretty powerful. In fact, in this example I only need to remember three things to construct my search:<\/p>\n<ul>\n<li>The dot or period character in a regex matches any single character (except line breaks). So c.t would match cat and cut, but not cart.<\/li>\n<li>The asterisk character in a regex tries to match the previous token zero or more times. t* would match t, tt, ttt, etc. (and also nothing at all). Plus you can use them together, so .* tells it to match zero or more instances of any character. So c.*t would match cat, cut and also cart.<\/li>\n<li>Lastly, you escape characters that have special meaning in a regex (such as the asterisk and dot above) by using the backslash. 
Let\u2019s say you actually want to match on a dot, you\u2019d need to specify that as \\.<\/li>\n<\/ul>\n<p>That\u2019s almost everything I remember about regular expressions, but it\u2019s enough for a lot of cool searches.<\/p>\n<p>Back to my example. I said I wanted to search for:<\/p>\n<ol>\n<li>a colon<\/li>\n<li>followed by zero or more characters<\/li>\n<li>followed by ITest<\/li>\n<li>followed by zero or more characters<\/li>\n<li>followed by a closing bracket<\/li>\n<\/ol>\n<p>Based on what we\u2019ve just discussed, the regex for each of the above would look like:<\/p>\n<ol>\n<li>:<\/li>\n<li>.*<\/li>\n<li>ITest<\/li>\n<li>.*<\/li>\n<li>\\)<\/li>\n<\/ol>\n<p>Note, I had to escape the closing bracket, as it has a special meaning in a regex (I\u2019ve just looked it up as I didn\u2019t remember what it was for. It\u2019s for grouping multiple tokens)<\/p>\n<p>The actual regex looks like this :.*ITest.*\\)<\/p>\n<p><a href=\"http:\/\/www.malcolmgroves.com\/blog\/wp-content\/uploads\/2010\/01\/search2.jpg\"><img loading=\"lazy\" decoding=\"async\" style=\"margin: 0px 10px 0px 0px; display: inline; border-width: 0px;\" title=\"search2\" src=\"http:\/\/www.malcolmgroves.com\/blog\/wp-content\/uploads\/2010\/01\/search2-thumb.jpg\" alt=\"search2\" width=\"579\" height=\"261\" align=\"left\" border=\"0\" \/><\/a>To use it, you need to enable Regular Expressions in your searches. In the screen shot above it\u2019s not shown, but if you make your edit window wide enough (or click on the &gt;&gt; image next to Case Sensitive above) you\u2019ll get the option to turn on Regular Expressions. (prior to Delphi 2010 it\u2019s a checkbox on the modal dialog that pops up when you hit Ctrl-F). Then just type :.*ITest.*\\) into the same search window you normally would.<\/p>\n<p>&nbsp;<\/p>\n<p>We\u2019ve now got dramatically fewer results (around half). Note this isn\u2019t perfect, but that\u2019s not what my aim is. 
I was aiming for better search results, not perfect search results. In this example I\u2019ve counted 4 false positives (it still matches parameters whose type starts with ITest, such as ITestListener), but it seems to cover all the other cases I mentioned, and 69 out of 73 is a much better score than 69 out of 143.<\/p>\n<p>I could no doubt get rid of the remaining false positives, and if you\u2019re keen to try, there\u2019s a good short regex reference and also a tutorial up on <a title=\"http:\/\/www.regular-expressions.info\" href=\"http:\/\/www.regular-expressions.info\" target=\"_blank\">http:\/\/www.regular-expressions.info<\/a>. There\u2019s also a list of the syntax supported by the IDE\u2019s regex engine <a href=\"http:\/\/docwiki.embarcadero.com\/RADStudio\/en\/Regular_Expressions\" target=\"_blank\">here<\/a>. However, what I have is a pretty good trade-off between results and effort. I often find if I try and make my regexes too complex, I start to spend more time debugging the search string than looking at the result. That might just be my hazy knowledge of regex syntax, however.<\/p>\n<p>How often do I use a regex in my searches? Honestly, not often. However, it\u2019s like a lot of things: once you get comfortable with it, you find yourself using it more often. Next time you find yourself with a search that is returning a very inaccurate result set, maybe it\u2019s worth considering them. I find on those occasions, with very little effort, I can find what I\u2019m looking for much more quickly and accurately than otherwise.<\/p>\n<p>Next up, I\u2019ll start looking at some of the other, less well known options you have for searching your source.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Being able to find elements in your code quickly and easily is critical to being productive in any IDE. Spend too long looking for things and you start to lose your train of thought. 
Over the years Delphi has introduced lots of different ways to search your code, some of them simple text-based matching, some [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[19,48,62],"class_list":["post-613","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-delphi","tag-embarcadero","tag-search"],"_links":{"self":[{"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/613","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=613"}],"version-history":[{"count":4,"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/613\/revisions"}],"predecessor-version":[{"id":1792,"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/613\/revisions\/1792"}],"wp:attachment":[{"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=613"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=613"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.malcolmgroves.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=613"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}