Smartcrawl - API issue

I'm having a weird issue, though: my SmartCrawl is trying to scan the incorrect site for both URL crawl and SEO Checkup.

The site is https://cy**s.co.uk

but it tries to scan https://funis.cy**s.co.uk

This site doesn't even exist anymore. I've scanned every file on the server and every table in the database and nothing references that domain, so I can only presume it's on your end.

I've tried uninstalling the plug-ins, disconnecting from the Hub and starting again, and it's still the same.

After deleting funis.c**s.co.uk from my account, SmartCrawl started referencing stage.c**as.co.uk. Only after removing that one too did SmartCrawl report the correct site.

Is this a new bug where SmartCrawl uses subdomains instead of actual domains?

  • Adam Czajczyk
    • Support Gorilla

    Hello JazzyDan

    I hope you're well today and thank you for reporting this.

    I admit it's a weird issue, and while we did have some similar reports in the past, they were all directly related to glitches in the sites' configurations (e.g. some redirects set, wrong domains still in the database etc). But so far we haven't had any reports like yours, where there's no reference to these other sites anywhere and yet they are scanned.

    If I understand correctly, now that those other sites have been removed, SmartCrawl is scanning the correct site. Is that right? I just want to make sure.

    I'm also discussing this with our developers to see if we can find out what could cause it.

    Kind regards,
    Adam

    • JazzyDan
      • The Crimson Coder

      Yeah, it's pretty odd.
      But I scanned every single table in the database for cyfas.funis, and did a find on all files as well, and nothing.
      And merely removing cyfas.funis from the hub made it then scan stage.funis, and removing stage.funis finally made it scan Funis.com.

      So weird.
      Thanks
      Dan
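
For anyone who wants to repeat the "find on all files" check described above, here is a minimal Python sketch. It is not the method used in this thread, only one way to do it: `find_references` is a made-up helper name, and the path and hostname in the usage comment are placeholders. Pointing it at a directory that also contains a SQL dump of the database would cover the table search as well.

```python
import os


def find_references(root, needle):
    """Yield (path, line_number, text) for each line under root that contains needle.

    Files are read as bytes and compared case-insensitively, so binary files
    and odd encodings do not stop the scan.
    """
    needle_bytes = needle.lower().encode("utf-8")
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    for line_number, line in enumerate(handle, start=1):
                        if needle_bytes in line.lower():
                            text = line.decode("utf-8", "replace").strip()
                            yield path, line_number, text
            except OSError:
                # Unreadable file (permissions, broken symlink): skip it.
                continue


# Hypothetical usage; replace both arguments with your own values:
# for hit in find_references("/path/to/site", "stage.example.com"):
#     print(hit)
```

An empty result only shows that the hostname is not stored as plain text in those files; it says nothing about what is stored on the Hub side.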

  • Adam Czajczyk
    • Support Gorilla

    Hi JazzyDan

    I'm still talking to our devs about that, but I must say we're still quite confused by this. That should not have happened. There could be a possible explanation (even though it would assume some "glitch" on either your side or ours) if the site had its domain switched or was migrated between these domains and something was not "complete" during the migration (e.g. some redirects set but URLs not properly changed in the db).

    But I know from your chat that this is not exactly the case. However, I'm wondering about one other thing: were there any redirects set in the past between any of these domains? I mean "redirects" as in 301 or 302 or similar redirect on a "server" level? Some aliases or forwards maybe?

    Best regards,
    Adam
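
A quick way to check whether a server-level 301/302 is still in place for any of these domains is to follow the `Location` headers by hand. The Python sketch below uses only the standard library; `head` and `trace_redirects` are illustrative names, the URL in the usage comment is a placeholder, and nothing here reflects how SmartCrawl's own crawler works.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url, timeout=10):
    """Send one HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=head, max_hops=10):
    """Return the (url, status) hops seen while following redirects from url."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops


# Hypothetical usage; replace the URL with one of the domains in question:
# for hop_url, status in trace_redirects("http://stage.example.com/"):
#     print(status, hop_url)
```

If any hop ends up on a different hostname than the one you started with, that redirect is worth mentioning in the thread. The `fetch` parameter exists so the tracing logic can be exercised with a stub instead of real network calls.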

  • Adam Czajczyk
    • Support Gorilla

    Hi JazzyDan

    I assume that you're using this option to make sure that a given domain is always accessed via the HTTPS protocol rather than redirecting from one domain to another, right? But I'm not very familiar with this particular tool (as a Plesk feature) so I'm not sure how it works exactly.

    I would say that if there were no redirects set "from one domain to another", that shouldn't be a problem. But if this (or a similar tool/feature) was used to redirect one domain to another (e.g. production to staging, or "old domain" to "new one"), in the context of any of the three addresses involved in the case, that could indeed be part of the problem, as 301 is a permanent redirect and crawlers would "remember" it.

    But in my opinion it's rather unlikely that this is related - if it wasn't used to redirect between any of these domains but only within the same domain.

    I must admit we're still quite confused by the case, as we (our developers, to be exact) still couldn't find anything in the plugin's code or in our logs that could possibly cause this. I'm actually wondering now what would happen if you re-added any of those sites that were removed from the Hub. For example: if you set up a "stage.funis...." site now (I think just the most basic, simple WP install would do), installed WPMU DEV Dashboard on it and logged it back in to your account. I'm wondering if SmartCrawl would then "jump back" to it or not.

    That might be worth giving a try (you can always remove it again) and might actually give us some additional insight into the case if it would indeed "jump back". Do you think you could give it a quick try?

    Best regards,
    Adam

    • JazzyDan
      • The Crimson Coder

      Agreed, I think it's unlikely the Plesk redirect is the issue, as I use this on the majority of my sites, and they do not exhibit this issue.

      I have reconnected stage.cyfas.co.uk and that has caused cyfas.co.uk to scan the incorrect site again. I'll leave this connected so you can investigate further, but please keep in mind this is a live site that is supposed to be gathering SEO reports from Moz to present to my client.

      Thanks again for your help

  • Adam Czajczyk
    • Support Gorilla

    Hi JazzyDan

    Thank you for giving it a go. The fact that it "jumped back" is important information.

    I realize that the site is live, though I actually completely forgot that it's Friday afternoon already (UTC time) and our developers might be a bit less "reachable" as, unlike the support team, they are not necessarily working 24/7. I'm sorry about that. Do you think you could keep it set that way until Monday, in case I don't have any information sooner? Just in case.

    Meanwhile, could you also enable support access for cyfas.co.uk so I could take a look at the configuration and also fire up the scan there and see how it goes? You can enable support access on the "WPMU DEV -> Support -> Support Access" page in the site's back-end. Just let me know here once it's done, please, as I won't get any automatic notification.

    Best regards,
    Adam

  • Adam Czajczyk
    • Support Gorilla

    Hi JazzyDan

    Thank you for keeping the site connected and for granting access.

    I did a pretty detailed check on the site and I can confirm that there's nothing that even suggests what could possibly be causing the issue. Furthermore, the sitemap points to the proper URLs and if I run a crawl from our end, it does scan the right site - yet there are references in the report to the "stage" site.

    Using the "Search & Replace" plugin that you have installed on the site (I only used "dry run" to check, not to make any changes), I also found that SmartCrawl is indeed storing the "stage" URL in the _options table in the db as the "checkup_url" value. That's not really a setting that it follows, but rather a log of which site was crawled, so it confirms that the wrong site is being scanned.

    I'm still waiting for our developers' input (I believe I should get some more info, or be able to discuss it with them directly, tomorrow), so I'll update you here again. I was also thinking that perhaps I could take a look at the server/db directly as well?

    I'm aware that you did check it all carefully but there's always a slight chance that I may spot something. As they say, four eyes see more than two. At least sometimes :slight_smile:

    That said, it would be great if you could provide me with access credentials. The cPanel/Plesk or any panel that you're using would be very handy.

    Note: Don't leave your login details in this ticket.
    Instead, you can send me your details using our contact form https://premium.wpmudev.org/contact/#i-have-a-different-question and the template below:

    Subject: "Attn: Adam Czajczyk"
    - Site login URL
    - WordPress admin username
    - WordPress admin password
    - FTP credentials (host/username/password)
    - cPanel or Plesk or similar panel credentials (host/username/password) <- that's preferred; if not available, please include direct URL to phpMyAdmin
    - Folder path to site in question
    - Link back to this thread for reference
    - Any other relevant urls/info

    Best regards,
    Adam

  • Adam Czajczyk
    • Support Gorilla

    Hi JazzyDan

    Thank you for providing me with credentials.

    I used them to access the site again and to check the server, and I reviewed every single aspect that could possibly (or even remotely) be related, and I must say I'm still at the same point :slight_frown: I see nothing that could be causing it. I tend to admit that it might be something on our end, or that we both - me and you - are missing something.

    I took the liberty of passing the access credentials that you sent on to our developers, in addition to my earlier report to them, so they would also be able to access the site/server and do yet another check. We'll update you here as soon as possible.

    Best regards,
    Adam
