I have noticed that there are many questions in the forum that deal with
1) "list exceeded" errors
2) removing words, URLs, or phrases from lists.
I have personally struggled with these basic list management challenges.
So I ran an experiment and wrote a sample code block as a starting point. My attempt failed dismally, so I am hoping the experts can add their valued input to this thread and help us get to a working example that does what it is supposed to.
1) I wanted to manage the "list exceeded challenge"
2) I wanted my phrase match to be accurate.
The challenge here is to refine the code sample based on my findings, so that we have a standard block of code that can be used and modified.
In the experiment I wanted to remove all URLs from a list that did not contain a given phrase, such as "content-nation", "content-curation", or "curation".
In 4 examples I discovered that something weird was going on. I think I may not clearly understand what the CONTAINS command does, and I would like some feedback from the pros on the findings and on possible enhancements to the code, so that the members of the forum can use these blocks of code in their bots.
Hopefully we get some great samples and comments.
So the bot is attached and here is the source to look at:
ui text box("Search Parameter", #SearchStringDashed)
set(#SearchStringDashed, $trim(#SearchStringDashed), "Global")
set(#SearchStringDashed, $change text casing(#SearchStringDashed, "Lower Case"), "Global")
ui stat monitor("Original List count: ", #UrlsCountoriginal)
ui stat monitor("Processed List: ", $list total(%Urls))
set(#Urls, "/Curation
/Digital-Curation
/Social-Curation
/Content-Curation-1
/Data-Curation
/Store-Curation
/Can-content-curation-become-mainstream
/Who-curates-the-curators
/Web-Content-Curation-Applications
/Web-Content-Curation-Startups
/search?q=curation&context_type=&context_id=
#
#
/Why-isnt-the-National-Museum-of-the-American-Indian-more-like-the-Holocaust-Museum
/National-Geographic-1
/National-Football
/United-Nations
/Nations
/The-National-band
/Live-Nation
/National-Public-Radio
/Bling-Nation
/Washington-Nationals
/search?q=curation+nation&context_type=&context_id=
#
#
/Content-Curator
/Content-Curation-1
/Can-content-curation-become-mainstream
/Web-Content-Curation-Applications
/Web-Content-Curation-Startups
/What-are-the-best-content-curation-tools-for-daily-use
/When-does-content-curation-become-content-creation
/What-is-web-content-curation
/Web-Content-Curation/Can-you-make-money-curating
/Web-Content-Curation/Who-are-the-best-tech-content-curators-to-follow-in-2011
/search?q=content+curation&context_type=&context_id=
#
#", "Global")
comment("grab urls for questions")
add list to list(%Urls, $list from text(#Urls, $new line), "Delete", "Global")
set(#UrlsCountoriginal, $list total(%Urls), "Global")
set list position(%Urls, 0)
set(#FirststLevelQ, 0, "Global")
set(#FirststLevelQ_total, $list total(%Urls), "Global")
loop(#FirststLevelQ_total) {
    if($comparison(#FirststLevelQ_total, ">", #FirststLevelQ)) {
        then {
            if($contains($change text casing($list item(%Urls, #FirststLevelQ), "Lower Case"), #SearchStringDashed)) {
                then {
                }
                else {
                    remove from list(%Urls, #FirststLevelQ)
                    decrement(#FirststLevelQ)
                    decrement(#FirststLevelQ_total)
                }
            }
            increment(#FirststLevelQ)
            if($contains($list item(%Urls, #FirststLevelQ), "#")) {
                then {
                    remove from list(%Urls, #FirststLevelQ)
                    decrement(#FirststLevelQ)
                    decrement(#FirststLevelQ_total)
                }
                else {
                }
            }
            increment(#FirststLevelQ)
        }
        else {
        }
    }
}
save to file("{$special folder("Desktop")}\\{#SearchStringDashed}-results.txt", %Urls)
alert("Completed!")
stop script
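One thing worth checking in the loop above: #FirststLevelQ is incremented twice per pass (once after the phrase check and once after the "#" check), so each pass appears to test one item for the phrase and the following item for "#", rather than giving every item both checks. The sketch below is not UBot code. It is a Python illustration of the intended logic, assuming every item should get both checks and that the index should only advance when nothing was removed. The name filter_urls and the sample list are made up for the illustration.

```python
def filter_urls(urls, phrase):
    """Keep only URLs that contain `phrase`; drop "#" placeholder entries too."""
    phrase = phrase.strip().lower()   # same normalisation as the script: trim, then lower case
    items = list(urls)                # work on a copy so the caller's list is untouched
    i = 0
    while i < len(items):             # length is re-checked every pass, so the index cannot run past the end
        item = items[i].lower()
        if "#" in item or phrase not in item:
            del items[i]              # the next item slides into position i, so do not advance
        else:
            i += 1                    # advance only when the item was kept
    return items


# Hypothetical sample built from a few entries of the list in the script.
sample = [
    "/Content-Curation-1",
    "#",
    "/National-Geographic-1",
    "/United-Nations",
    "/What-is-web-content-curation",
    "#",
]
print(filter_urls(sample, "content-curation"))
```

Re-checking the list length on every pass is what avoids running off the end of a list that is shrinking, which may be where the "list exceeded" errors come from.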
In this experiment I am looking for URLs that contain the keywords as a phrase match, not as a broad match.
I have encoded 30 items in the list.
I have added random "#" entries to represent odd characters we may want to remove from the list as well.
The search parameter must use a - (dash), because the URLs have dashes between the words and we are looking for the URLs where the phrase appears in the URL.
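To make the phrase match versus broad match distinction concrete, here is a small Python sketch (again, not UBot code). It assumes a "phrase match" means the whole dashed phrase appears as one substring of the lower-cased URL, and a "broad match" means any single word of the phrase appears somewhere in the URL. The function names phrase_match and broad_match are hypothetical.

```python
def phrase_match(url, phrase):
    """True when the whole dashed phrase occurs in the URL (case-insensitive)."""
    return phrase.strip().lower() in url.lower()


def broad_match(url, phrase):
    """True when at least one word of the dashed phrase occurs in the URL."""
    words = [w for w in phrase.strip().lower().split("-") if w]
    url = url.lower()
    return any(w in url for w in words)


# Three URLs taken from the list in the script.
for url in [
    "/Content-Curation-1",
    "/Content-Curator",
    "/search?q=content+curation&context_type=&context_id=",
]:
    print(url, phrase_match(url, "content-curation"), broad_match(url, "content-curation"))
```

Under these assumptions the search URL is a broad match but not a phrase match, because its words are joined with "+" rather than a dash.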
Experiment 1: search on "content-curation" (the original list is in the source above)
It returned 19 results.
Findings:
1) all results had either "content" or "curation" - not desired result
2) all the "#" signs were removed - completed successfully
3) the following URLs should most definitely not be present:
/National-Geographic-1
/United-Nations
/The-National-band
/National-Public-Radio
/Washington-Nationals
Experiment 2: search on "curation-nation"
returned 14 results
1) all results had either "curation" or "Nation" - not desired result
2) all the "#" signs were removed - completed successfully
3) no URL in the entire list contains the phrase "curation-nation"
Experiment 3: search on "curation"
returned 24 results
1) all results had either "curation" or "Nation" - not desired result
2) all the "#" signs were removed - completed successfully
3) the following URLs do not contain the word "curation":
/National-Geographic-1
/Washington-Nationals
/National-Football
/Who-curates-the-curators
/Why-isnt-the-National-Museum-of-the-American-Indian-more-like-the-Holocaust-Museum
/Nations
/Bling-Nation
Experiment 4: search on "blah-blah"
returned 15 results
1) no results should have been returned. 15 were returned, none of which contain "blah-blah" in the URL:
/Can-content-curation-become-mainstream
/National-Geographic-1
/What-are-the-best-content-curation-tools-for-daily-use
/Web-Content-Curation/Who-are-the-best-tech-content-curators-to-follow-in-2011
/Data-Curation
/search?q=curation+nation&context_type=&context_id=
/Social-Curation
/Who-curates-the-curators
/search?q=curation&context_type=&context_id=
/Nations
/When-does-content-curation-become-content-creation
/search?q=content+curation&context_type=&context_id=
/United-Nations
/Live-Nation
/Content-Curator
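For checking findings like the ones above without reading every line by eye, a small helper can list the returned URLs that do not contain the search phrase. This is a Python sketch, not UBot code. The name unexpected_results is hypothetical, and the sample holds four of the URLs returned in Experiment 4.

```python
def unexpected_results(results, phrase):
    """Return the URLs in `results` that do not contain `phrase` (case-insensitive)."""
    phrase = phrase.strip().lower()
    return [url for url in results if phrase not in url.lower()]


# Four of the URLs that Experiment 4 returned.
returned = [
    "/Can-content-curation-become-mainstream",
    "/National-Geographic-1",
    "/Data-Curation",
    "/Nations",
]
print(unexpected_results(returned, "blah-blah"))  # every entry is unexpected
print(unexpected_results(returned, "curation"))   # only the two "Nation" URLs are unexpected
```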
It would be great if you guys who are seasoned programmers could look at this code and fill in the gaps so we can see the proper way to do this.
Monkey see - monkey do!! haha
Your feedback would be appreciated as I believe this will help many of us.
Thanks in advance.