One Yukkuri Place


Topic: Capitalized Tags

Posted under Tags

EasyV

There are a bunch of old pictures with capitalized tags like "Wriggle" instead of "wriggle"
How can I look them up?
If I search them as-is, the system gives me the uncapitalized results

  • ID: 11822
  • poweryoga

    do you have any examples? I wasn't aware capitalization affected search.

  • ID: 11823
  • Ruukasu

    EasyV said:

    There are a bunch of old pictures with capitalized tags like "Wriggle" instead of "wriggle"
    How can I look them up?
    If I search them as-is, the system gives me the uncapitalized results

    Can you give us the URL for an item with a capitalized tag?

  • ID: 11825
  • EasyV

    poweryoga said:

    do you have any examples? I wasn't aware capitalization affected search.

    It doesn't per se, but they will be mistagged in that "wriggle" will be considered a generic tag instead of a character tag, and as such those pictures might not appear when looking for wriggle

    Ruukasu said:

    Can you give us the URL for an item with a capitalized tag?

    Yeah, that's going to be a problem as I fixed those I found, and thought about asking only after the fact
    If I find one again I'll link it here

  • ID: 11826
  • poweryoga

    if you have examples I can do a mass tag edit. The only alternative is to dig through the DB and I don't really want to do that since it's sort of like needle in a haystack.

  • ID: 11827
  • EasyV

    poweryoga said:

    if you have examples I can do a mass tag edit. The only alternative is to dig through the DB and I don't really want to do that since it's sort of like needle in a haystack.

    Actually I just realized the system logs every tag change, so we can see the capitalized tags even though I changed them
    The tag history of post #32420 shows a "Wriggle", while post #12495 had "Anko-chan" instead of "anko-chan"

    Edit:
    I also discovered post #20641, which has "Sanae"

    Updated by EasyV

  • ID: 11828
  • poweryoga

    so for example, does post #32420 not show up if you search for "Wriggle"? That would mean the search is case sensitive but for some reason the results aren't.

    edit: so to answer my own question: the capitalized tags don't seem to show up in searches. They look to be relics from the danbooru 1 -> danbooru 2 migration and possibly from shimmie originally. All tags are now (by default) squashed down to lowercase, and searches only turn up lowercase results as well.

    The way you can get these "bad" tags to show up is to do a wildcard search, which unfortunately returns some false positives as well. For example: if you want to find "wriggle" and "Wriggle" you can use "*riggle". Do note wildcard searches are SLOW if there are a lot of hits.
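To make the wildcard idea above concrete, here is a small Python sketch. It is not how the site's search works internally; it only mimics a case-sensitive wildcard match with the standard library's `fnmatchcase`. The sample tag list, the tag `not_wriggle`, and the helper names are all made up for illustration.

```python
from fnmatch import fnmatchcase


def wildcard_hits(pattern, tags):
    """Return the tags that match a shell-style wildcard, case-sensitively."""
    return [t for t in tags if fnmatchcase(t, pattern)]


def same_tag_only(hits, wanted):
    """Drop false positives: keep hits equal to `wanted` once lowercased."""
    return [t for t in hits if t.lower() == wanted]


# Hypothetical tag list; "not_wriggle" stands in for a false positive.
sample_tags = ["wriggle", "Wriggle", "not_wriggle", "sanae", "Sanae"]

hits = wildcard_hits("*riggle", sample_tags)
same_tag = same_tag_only(hits, "wriggle")
needs_fixing = [t for t in same_tag if t != t.lower()]

print(hits)
print(needs_fixing)
```

The second helper shows one way to weed out the false positives that a leading `*` lets through.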

    Here's the bad news: there's no good way to do a bulk-edit for these bad tags since they are technically bad data.
    Here's the good news: if you give me the tags I can give you a list of posts to fix (manually).

    For example: "Wriggle" is in the following post IDs:

    30312
    30309
    30031
    19351
    5820
    30308
    30307
    4708
    30313
    19358
    13563
    32285
    26634
    12302
    10842
    32724
    4952
    18439
    25997
    8266
    29924
    18440

    I've gone ahead and fixed some of them but it's somewhat late here and I'm going to sleep. But we have a solution for this, so yay! I can write up a wildcard search for this tomorrow and dig up all the "bad" tags that we can fix.

    edit edit: You don't actually need to "fix" the tags either. Just edit and submit and it'll squash them down nice and proper.

    Updated by poweryoga

  • ID: 11830
  • EasyV

    Thank you easy Dosu!
    Since "edit" is all that's needed, I think I'll use the "quick edit" function (in the same menu as mass tagging) and see if I can get at least Wriggle down
    Though I'm wondering if there's a way to find all capitalized tags, instead of relying on pure luck (I found those through the random post function)
    Maybe a database query? Pattern matching should be case sensitive if I recall correctly
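As a rough illustration of the case-sensitive pattern matching idea above, here is a Python sketch that flags any tag containing an uppercase ASCII letter. It runs on a plain list rather than the site's database, whose schema is not shown in this thread; the function name is hypothetical, and the sample tags are the ones mentioned in this thread.

```python
import re

# Python regexes are case-sensitive unless re.IGNORECASE is passed.
HAS_UPPER = re.compile(r"[A-Z]")


def capitalized(tags):
    """Return, in sorted order, the tags that contain an uppercase letter."""
    return sorted(t for t in tags if HAS_UPPER.search(t))


tags_seen = [
    "Wriggle", "wriggle", "Anko-chan", "anko-chan",
    "Sanae", "Kanako", "Kochiya_Sanae",
]

found = capitalized(tags_seen)
print(found)
```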

    Edit: There should be no "Wriggle" anymore, though there might still be some "Sanae", and I saw at least one "Kanako"

    Updated by EasyV

  • ID: 11836
  • Ruukasu

    EasyV said:

    There should be no "Wriggle" anymore, though there might still be some "Sanae", and I saw at least one "Kanako"

    I found one out of two tagged as "Kochiya_Sanae". It has since been fixed.

  • ID: 11837
  • poweryoga

    Unfortunately this isn't as easy as I thought since the tag string is one big concatenated list of the tags, so I'll have to try to parse the string or figure out how to write a fulltext search. Will play with this some more.
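A minimal sketch of the "parse the string" route, assuming each post's tags are stored as one whitespace-separated string (an assumption, since the actual column layout is not shown here). The function names are hypothetical and the sample rows are made up: they reuse the post numbers and capitalized tags from this thread, but the other tags and post 1 are invented.

```python
def posts_with_capitalized_tags(tag_strings):
    """Map post ID -> list of tags in that post that are not all lowercase."""
    flagged = {}
    for post_id, tag_string in tag_strings.items():
        bad = [t for t in tag_string.split() if t != t.lower()]
        if bad:
            flagged[post_id] = bad
    return flagged


def squash(tag_string):
    """Lowercase every tag, as a re-submit of the post is described to do."""
    return " ".join(t.lower() for t in tag_string.split())


# Made-up rows for illustration only.
sample = {
    32420: "Wriggle yukkuri",
    12495: "Anko-chan",
    20641: "Sanae reimu",
    1: "wriggle",
}

flagged = posts_with_capitalized_tags(sample)
print(flagged)
print(squash(sample[32420]))
```

Splitting into whole tags before comparing avoids the false positives that a substring or wildcard match on the concatenated string would produce.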
