Chris M, posted October 13, 2014:
I have a list of URLs saved to a variable. I want to remove any URLs that contain a number or a hyphen. What is the best way to do this?
Code Docta (Nick C.), posted October 13, 2014:
Here ya go!
alert($trim($replace regular expression("boob.com/1231googga.com/123-546mama.comchoochoo.net", "(.*)(-|[0-9])", $nothing)))
Regex magic
TC
Chris M, posted October 13, 2014:
Sick. Thank you TC
Chris M, posted October 13, 2014:
The URLs would be like site0.com or site-test.com, and those are the types I want to remove from the list.
Code Docta (Nick C.), posted October 13, 2014:
.*[0-9]\..*|.*-.*
siite0.com
564-site.aol
google.com
site-2.com
site-three.com
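A quick way to see what that pattern does and does not catch is to try it outside UBot. The sketch below is Python, not UBot script, and it assumes each URL is tested on its own as a whole line; the sample URLs are all taken from this thread. It suggests the pattern only flags a digit when it sits directly before a dot, so digits elsewhere in the name slip through.

```python
import re

# Pattern quoted from the post above, applied to each URL on its own.
PATTERN = re.compile(r".*[0-9]\..*|.*-.*")

# Sample URLs taken from this thread.
urls = ["siite0.com", "564-site.aol", "google.com", "site-2.com",
        "site-three.com", "7545hgjhg.com", "ab2c.com"]

# URLs the pattern matches, i.e. the ones that would be removed.
flagged = [u for u in urls if PATTERN.fullmatch(u)]

# URLs that contain a digit or hyphen but are NOT matched by the pattern.
missed = [u for u in urls
          if not PATTERN.fullmatch(u) and re.search(r"[0-9-]", u)]

print("flagged:", flagged)
print("missed: ", missed)
```

If the goal is "any digit or hyphen anywhere", a plain search for `-|[0-9]` on each item is the simpler test.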
Code Docta (Nick C.), posted October 13, 2014:
This concept should work too:
loop($list total(%url from file)) {
    set(#NLI, $next list item(%url from file), "Global")
    if($plugin function("StringManagementPlugin.dll", "$SMP IsEmpty", $find regular expression(#NLI, "-|[0-9]"))) {
        then {
            add item to list(%cleaned urls, #NLI, "Delete", "Global")
        }
        else {
        }
    }
}
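For anyone who wants to check the logic of that loop outside UBot, here is a minimal Python sketch of the same idea: walk the list, search each item for a hyphen or digit, and keep the item only when nothing is found. The function name `clean_urls` is made up for illustration, and unlike the UBot version with its "Delete" option, this sketch does not drop duplicate entries.

```python
import re

# Same test as the UBot loop: an item is bad if it has a hyphen or a digit.
BAD_CHARS = re.compile(r"-|[0-9]")


def clean_urls(urls):
    """Return the URLs that contain neither a hyphen nor a digit."""
    cleaned = []
    for url in urls:
        # No match found plays the role of the "is empty" check in the post.
        if BAD_CHARS.search(url) is None:
            cleaned.append(url)
    return cleaned


# Sample URLs taken from this thread.
sample = ["siite0.com", "564-site.aol", "google.com",
          "site-2.com", "site-three.com"]
print(clean_urls(sample))
```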
Chris M, posted October 13, 2014:
That is still not removing the URLs with numbers or hyphens. So weird.
Code Docta (Nick C.), posted October 13, 2014:
The final cut list was 80k, so it needed the big guns. This is for the others, Chris.
plugin command("Bigtable.dll", "Clear all large list")
plugin command("Bigtable.dll", "Large list from file", "LFF", "{$special folder("Application")}\\10-13-2014.txt")
ui stat monitor("LFF: {$plugin function("Bigtable.dll", "Large list total", "LFF")}", "cleaned: {$plugin function("Bigtable.dll", "Large list total", "clean")}list pos: {#index}")
set(#index, 0, "Global")
loop($plugin function("Bigtable.dll", "Large list total", "LFF")) {
    set(#NLI, $plugin function("Bigtable.dll", "Large list item", "LFF", #index), "Global")
    if($plugin function("StringManagementPlugin.dll", "$SMP IsEmpty", $find regular expression(#NLI, "-|[0-9]"))) {
        then {
            plugin command("Bigtable.dll", "Add item to large list", "clean", #NLI)
        }
        else {
        }
    }
    increment(#index)
}
save to file("{$special folder("Application")}\\clean.txt", $plugin function("Bigtable.dll", "Large list return", "clean"))
TC
Code Docta (Nick C.), posted October 13, 2014:
Here is the first version for large data, with no loop. Fixed the regex too.
plugin command("Diagnosticfunctions.dll", "Stop Watch Reset")
plugin command("Diagnosticfunctions.dll", "Stop Watch Start")
plugin command("Bigtable.dll", "Clear all large list")
plugin command("Bigtable.dll", "Large list from file", "LFF", "{$special folder("Application")}\\10-13-2014.txt")
plugin command("Bigtable.dll", "large List from Regex", "bad urls", $plugin function("Bigtable.dll", "Large list return", "LFF"), ".*(\\d|-).*\\..*", "replace")
plugin command("Bigtable.dll", "Compare large list", "LFF", "bad urls", "Remove duplicates")
save to file("{$special folder("Application")}\\regexed--clean.txt", $plugin function("Bigtable.dll", "Large list return", "LFF"))
plugin command("Diagnosticfunctions.dll", "Stop Watch Stop")
ui stat monitor("LFF: {$plugin function("Bigtable.dll", "Large list total", "LFF")}", "cleaned: {$plugin function("Bigtable.dll", "Large list total", "bad urls")}list pos: {#index}timer: {$plugin function("Diagnosticfunctions.dll", "Stop watch elasped time")}")
Attached is the UBot file with both versions and timers. You will see why I tried without the loop first.
loop: 30 seconds
no loop: 3 seconds
Yes, I did it in U5.5... I am proud to say.
Get your own list. You can make your own with "$rand text".com, then put some real domains in. Here are some:
siite0.com
564-site.aol
google.com
site-2.com
site-three.com
bing.net
0-0auto.com
7545hgjhg.com
546jhk-jguy.ini
TC
chris url filter.ubot
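The no-loop version seems to work in two steps: one regex pass over the whole text collects every bad line, then those lines are subtracted from the full list. Below is a rough Python sketch of that idea, not the Bigtable plugin itself. It assumes the "Compare large list" step acts as a list difference, and the name `clean_bulk` is invented for the example. The pattern is the one from the post, written with single backslashes.

```python
import re

# Pattern from the post: a digit or hyphen somewhere before a dot.
# MULTILINE makes ^ and $ anchor at each line, so one pass covers the whole text.
BAD_LINE = re.compile(r"^.*(\d|-).*\..*$", re.MULTILINE)


def clean_bulk(text):
    """Collect all bad lines in one regex pass, then drop them from the list."""
    bad = {m.group(0) for m in BAD_LINE.finditer(text)}
    return [line for line in text.splitlines() if line and line not in bad]


# Sample URLs taken from the post above, one per line.
text = "\n".join([
    "siite0.com", "564-site.aol", "google.com", "site-2.com",
    "site-three.com", "bing.net", "0-0auto.com", "7545hgjhg.com",
    "546jhk-jguy.ini",
])
print(clean_bulk(text))
```

One thing to watch: because the pattern wants the digit or hyphen to come before a dot, an entry like abc.com/33, where the digits only appear after the last dot, would be kept.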
Bill, posted October 13, 2014:
This works for me.
set(#urls, "siite0.com
siite0s.com
564-site.aol
google.com
site-2.com
site-three.com
siiste0.com
564-site.aol
kdh.com
site-22.com
site-three.com
1siTe.com
abc.com
ab2c.com
a1a.com
a1a-az.com
abc.com/33
asfdadfs.com
afaf43asv.com
12a.com
123.com
afdr.com
rty.com/67", "Global")
alert(#urls)
clear list(%new_urls)
add list to list(%new_urls, $list from text($replace regular expression(#urls, ".*[0-9]\\\\..*|.*-.*|.*-.*|\\d.*|.*\\d.*", $nothing), $new line), "Delete", "Global")
alert(%new_urls)
UBotBuddy, posted October 14, 2014:
Well... I got to looking and I figured out another regular expression, this time using the "$find regular expression" node. So check it out. This technique looks for what to keep vs. looking at what to eliminate.
set(#urls, "siite0.com siite0s.com 564-site.aol google.com site-2.com site-three.com siiste0.com 564-site.aol kdh.com site-22.com site-three.com 1siTe.com abc.com ab2c.com a1a.com a1a-az.com abc.com/33 asfdadfs.com afaf43asv.com 12a.com 123.com afdr.com rty.com/67 ", "Global")
alert(#urls)
clear list(%new_urls)
add list to list(%new_urls, $list from text($find regular expression(#urls, "\\s[a-zA-Z]\{1,\}\\.[a-zA-Z]\{1,3\}\\s"), $new line), "Delete", "Global")
alert(%new_urls)
Buddy
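The keep-what-you-want idea can be sketched in Python like this. It is an illustration only, with a made-up function name `keep_plain`, and it differs from the post in one way: it splits the text on whitespace first and then checks each item as a whole, rather than matching the surrounding whitespace inside the pattern.

```python
import re

# Keep only "letters.letters" where the ending is 1 to 3 letters long,
# mirroring the letters-only pattern in the post.
KEEP = re.compile(r"[a-zA-Z]+\.[a-zA-Z]{1,3}")


def keep_plain(text):
    """Return the whitespace-separated items that fully match KEEP."""
    return [item for item in text.split() if KEEP.fullmatch(item)]


# A shortened version of the sample string from the post.
urls = ("siite0.com siite0s.com 564-site.aol google.com site-2.com "
        "site-three.com kdh.com abc.com/33 1siTe.com abc.com")
print(keep_plain(urls))
```

Splitting first is a design choice: when a pattern consumes the whitespace on both sides of a match and items are separated by a single whitespace character, the second of two good URLs in a row has no leading whitespace left to match. Also note that any whitelist of this shape drops endings longer than three letters.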
Chris M, posted October 16, 2014:
Great job, Buddy.
HelloInsomnia, posted October 17, 2014:
The file management plugin has a command for this:
plugin command("File Management.dll", "remove all from list", %urls, "(\\d|-)")
Code Docta (Nick C.), posted October 18, 2014:
Quoting HelloInsomnia:
    The file management plugin has a command for this:
    plugin command("File Management.dll", "remove all from list", %urls, "(\\d|-)")
Many hidden gems indeed!! Guys, what is not evident is that these lists are in the 80k range. All these solutions are usable for small lists that UBot can handle. Keep in mind that size does matter here, hehe.
Code Docta (Nick C.), posted October 18, 2014:
Here, I made a bot to make random URLs like the ones in this thread. Like I mentioned before, these lists are 80k-ish, which was not mentioned in the OP. So, try your solutions with an 80k list or even more. You can make as many as you want. Not the fastest maker, but I don't think there was a need to make it fast. Again, your solutions are great.
TC
random-url-maker.ubot
HelloInsomnia, posted October 18, 2014:
Quoting Code Docta (Nick C.):
    Many hidden gems indeed!! Guys, what is not evident is that these lists are in the 80k range. All these solutions are useable for small list that Ubot can handle. Keep in mind that size does matter here, hehe.
I just used your script to create 100k URLs. My code cleaned it in like a second. It took 30 seconds just to add the list from the file, but if you already have the list ready it will do 80k no problem.
Code Docta (Nick C.), posted October 18, 2014:
That is bad ass!! I had not thought about using it that way until you shared it here. Didn't have a chance to use it either. That's what this forum is about: sharing and learning, and I appreciate both!!
UBotBuddy, posted October 18, 2014:
Yep! Just depends on what size of gun you want to bring to a knife fight. lol