Multi-brand new/used car inventory search: an evolution story from Python script to web scraper to API-driven web app

Awesome, thanks for reviewing. I’m new to GitHub too, so I’m glad it went through OK. (I looked for the dealer type by searching the Network Inspector URLs for dealer.com, dealerinspire and dlron. Will use your way next time.)

I found an issue when searching for the 440i model: it returns a bunch of 330 models too. I added it to GitHub issues, as I figured that was a better place for it than here.

Have you thought about doing offline processing on a regular schedule, and then just serving the results from a key-value store? You could update every day or so. It would speed things up at the cost of potentially returning stale results.
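To make the idea concrete, here is a minimal Python sketch of a scheduled refresh feeding a key-value store. It is not how the app works today (the app is ASP.NET); `InventoryCache`, the `fetch` callback, and the one-day `max_age_seconds` are all made-up names and values for illustration. A plain dict stands in for the key-value store.

```python
import time


class InventoryCache:
    """Hypothetical in-memory key-value store keyed by (make, model)."""

    def __init__(self, max_age_seconds, clock=time.time):
        self.max_age_seconds = max_age_seconds
        self._clock = clock  # injectable so staleness can be tested
        self._store = {}

    @staticmethod
    def _key(make, model):
        return (make.lower(), model.lower())

    def refresh(self, make, model, fetch):
        """Run the slow scrape (fetch) offline and store results with a timestamp."""
        results = fetch(make, model)
        self._store[self._key(make, model)] = (self._clock(), results)

    def get(self, make, model):
        """Return (results, age_in_seconds), or (None, None) if never fetched."""
        entry = self._store.get(self._key(make, model))
        if entry is None:
            return None, None
        fetched_at, results = entry
        return results, self._clock() - fetched_at

    def is_stale(self, make, model):
        """True if the entry is missing or older than max_age_seconds."""
        results, age = self.get(make, model)
        return results is None or age > self.max_age_seconds
```

A scheduled job would call `refresh` for each make/model pair, and the web page would only ever call `get`. Returning the age alongside the results lets the UI show how old the data is, which is the trade-off mentioned above.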

Damn, this is next level! One of the most interesting threads to date on LH for sure :raised_hands:

Was thinking about this: maybe search all the models in a list and aggregate the data. Instead of the user typing in the make and model, just let them pick the make and model and display the results.

@RustyDaemon This is a long shot but I don’t mind helping you do this:

  1. Search the page for an iPacket link; or
  2. Manually scrape iPacket with the VIN # from the aggregated list to see if it exists.

That is great, we have 2 contributors now, thanks for doing that! Free-text model search is tricky, in that user input is plugged directly into the URLs we build for some search types… Not ideal by any means, but it was quick to implement. Ideally we should present the user with a model selection, like a dropdown of sorts, and then translate the choice into whatever model representation works best for each dealer type. That will require a lot of effort though, maybe something to add to a todo list. I’ll look at the 440i example and see what we can do to fix it.

Yeah, already thinking about how to approach this. Stale results could be an issue though, kinda defeating the purpose of the app being a true real-time inventory search. Another way is to run the search asynchronously and have the UI just poll for progress until it’s done. But that would also require completely revamping the UI; ASP.NET Web Forms aren’t meant to do stuff like that natively, or without fancy and expensive 3rd-party controls. If someone with knowledge of modern UIs could help… I could even expose the core search functions as API endpoints, so the UI could be written in anything that can consume API calls… Lots of options here, currently looking for the quickest and easiest one.
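For anyone picking up the UI side, this is roughly the shape of the "start a search, then poll for progress" idea, sketched in Python rather than ASP.NET. Everything here is hypothetical: `SearchJobs`, `search_one`, and the status fields are illustration names, not existing code in the repo.

```python
import threading
import uuid


class SearchJobs:
    """Hypothetical registry of background searches that a UI can poll."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}

    def start(self, dealers, search_one):
        """Kick off a search across dealers and return a job id immediately."""
        job_id = uuid.uuid4().hex
        job = {"done": 0, "total": len(dealers), "results": [], "finished": False}
        job["thread"] = threading.Thread(
            target=self._run, args=(job, dealers, search_one), daemon=True
        )
        with self._lock:
            self._jobs[job_id] = job
        job["thread"].start()
        return job_id

    def _run(self, job, dealers, search_one):
        for dealer in dealers:
            try:
                found = search_one(dealer)
            except Exception:
                found = []  # one failing dealer should not kill the whole search
            with self._lock:
                job["results"].extend(found)
                job["done"] += 1
        with self._lock:
            job["finished"] = True

    def status(self, job_id):
        """Snapshot for the UI: progress counters plus results so far."""
        with self._lock:
            job = self._jobs[job_id]
            return {
                "done": job["done"],
                "total": job["total"],
                "finished": job["finished"],
                "results": list(job["results"]),
            }

    def wait(self, job_id, timeout=None):
        """Block until the job's thread exits (handy for tests and scripts)."""
        self._jobs[job_id]["thread"].join(timeout)
```

If `start` and `status` were exposed as two API endpoints, any front end could call the first once and poll the second until `finished` is true, showing partial results as dealers complete.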

Oh yeah that would be great too. Would likely need some kind of clustering to group things like “RS 3” and “RS3” together, otherwise the list would be pretty full of similar entries.
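A cheap first pass, short of real clustering, might be to normalize each name before grouping. This is only a Python sketch under the assumption that stripping case, spaces, and punctuation is enough to merge spelling variants; `model_key` and `group_models` are hypothetical names.

```python
import re
from collections import defaultdict


def model_key(name):
    """Collapse a model name to lowercase letters and digits only."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def group_models(names):
    """Group raw model names that share the same normalized key."""
    groups = defaultdict(list)
    for name in names:
        groups[model_key(name)].append(name)
    return dict(groups)


print(group_models(["RS 3", "RS3", "rs-3", "440i"]))
# {'rs3': ['RS 3', 'RS3', 'rs-3'], '440i': ['440i']}
```

This only merges spelling variants. Names that differ by trim or drivetrain wording would still land in separate groups, so something smarter would be needed on top of it.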

BTW, I’m not trying to give instructions from the sideline. Pretty busy with work related stuff right now, but I’d love to help out on the UI side of things soon.

You mean adding iPacket info to listings as a feature? I’ve seen lots of dealer results with iPacket links, though not all of them have one. Yeah, this can be added fairly quickly… Another column in the results grid with an iPacket link for the results that have one?

Yup, it seems to come down to three dealers that use different search URLs than the generic DealerCom one. I added comments to the issue. Great idea to track issues in GitHub; let’s track all open issues there for simplicity’s sake.

Yes, so add that as a column and, though this could be a long shot, be able to pull features from the MSRP sheet. Adding the link is a great start though!

There’s probably too much variation from site to site in how packages and options are listed (e.g. some dealer sites don’t manage to post even a partial list of features, or anything closely resembling the true window sticker)… but an awesome next step would be to pull packages/options and match them with the pricing guide for each make/model. Just food for thought.

Really want to be helping more on back end but have been absolutely crushed in work/personal life in the last week :sweat:

Some dealers have these nice links to vehicle records:


The actual dealer link to the vehicle:
https://www.eurobethesdamercedes.com/new-inventory/index.htm?year=2020&superModel=G-Class&referrer=%2Fnew-inventory%2Findex.htm&lastFacetInteracted=

Would be incredibly helpful if all dealers used the following structure, providing all packages with their “MSRP”

versus something wholly unhelpful like this, which often makes you search to find what options are actually included.

Interestingly, both websites seem to use the DealerInspire platform, yet they clearly disclose packages and options differently, whether intentionally or not.

Any thoughts on this?

So I added support for DealerCom dealers that use custom URLs but still support searching by model via query string parameters. For example:

{
  "make": "BMW",
  "name": "Long Beach BMW",
  "url": "https://www.longbeachbmw.com",
  "customurl": "https://www.longbeachbmw.com/new-bmw/long-beach.htm?superModel={0}",
  "dealertype": "DealerCom"
}

vs. a regular DealerCom dealer, whose search URL looks like this:

www.dealername.com/new-inventory/index.htm?model={0}

This allows us to add the 3 offending Cali BMW dealers back, and they won’t pollute search results with unrelated models anymore:

  • “Long Beach BMW”
  • “BMW of Monrovia”
  • “Beverly Hills BMW”
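For readers following along, the lookup logic boils down to "use `customurl` if the dealer has one, otherwise fall back to the generic DealerCom pattern". Here is a Python sketch of that idea, not the actual app code: `build_search_url` is a hypothetical name, and URL-encoding the model is my own assumption about what is safe to do.

```python
from urllib.parse import quote

# Generic DealerCom search path, as quoted in the post above.
DEFAULT_DEALERCOM_PATH = "/new-inventory/index.htm?model={0}"


def build_search_url(dealer, model):
    """Fill the {0} placeholder using the dealer's custom template if present."""
    template = dealer.get("customurl") or (
        dealer["url"].rstrip("/") + DEFAULT_DEALERCOM_PATH
    )
    # URL-encode the model so free-text input cannot break the query string.
    return template.replace("{0}", quote(model))


long_beach = {
    "make": "BMW",
    "name": "Long Beach BMW",
    "url": "https://www.longbeachbmw.com",
    "customurl": "https://www.longbeachbmw.com/new-bmw/long-beach.htm?superModel={0}",
    "dealertype": "DealerCom",
}
print(build_search_url(long_beach, "440i"))
# https://www.longbeachbmw.com/new-bmw/long-beach.htm?superModel=440i
```

Keeping the template in the dealer config means a new oddball dealer only needs a config entry, not a code change.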

@AndieNarwhal, I believe the issue with the 440i search that you reported is now closed.

I think those platforms are highly customizable, and some dealers just don’t want to include every feature the platform supports, like the list of packages/options and such. Even URLs can be set up differently between dealers on seemingly the same platform, as the issue with the 3 Cali BMW dealers above shows. Unfortunately there’s no simple solution; I feel that the more dealers we add, the more tweaks the generic approach to scraping each platform will need.

Those are my favorite dealers, where you can just open the PDF and see the full build with all the options/packages.

I’m thinking of starting with small steps: detecting the presence of an iPacket link and including it as a link in the search result row would be a good step 1. Visiting the link programmatically and attempting to parse the PDF would be a huge step 2, and probably not as easy as the 1st one lol.
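Step 1 could look something like the Python sketch below, using only the standard library's `html.parser`. It assumes an iPacket link can be recognized by "ipacket" appearing somewhere in the `href`, which I have not verified across dealer platforms; `IPacketLinkFinder`, `find_ipacket_links`, and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class IPacketLinkFinder(HTMLParser):
    """Collect hrefs of <a> tags that look like iPacket links (assumed pattern)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Assumption: an iPacket link contains "ipacket" somewhere in its URL.
        if "ipacket" in href.lower():
            self.links.append(href)


def find_ipacket_links(html):
    """Return every iPacket-looking link found in a listing's HTML."""
    finder = IPacketLinkFinder()
    finder.feed(html)
    return finder.links


# Hypothetical listing snippet, not taken from a real dealer page.
sample = """
<div class="vehicle">
  <a href="/new-inventory/details.htm">Details</a>
  <a href="https://ipacket.example/msrp/TESTVIN">Window sticker</a>
</div>
"""
print(find_ipacket_links(sample))
# ['https://ipacket.example/msrp/TESTVIN']
```

An empty list would simply mean the extra column stays blank for that result row.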

Yeah, I know. Some put the MSRP options in the listing and some don’t, but all the iPacket links are the same. He would be able to just search for that logo like he did for the Volvo loaners, or he could scrape the page for an iPacket link.

This is exactly what I’m referring to when I say iPacket links.

Yes, I thought about iPacket after reading @Rsantoro12’s post

The only problem is that some dealers upload a PDF and some upload an image for the MSRP sheet.

Yes, I’ve seen both.

Added iPacket link detection for DealerInspire dealers; I only had time to test it on one MB dealer though.

It links directly to the MSRP section of the iPacket. If someone could provide samples of DealerOn and DealerCom listings containing iPacket links, I’d wire those up too.

Eh, silly me, @Ursus’s link right here is a DealerCom site with iPackets. I’ll add that next. The only example left to find is a DealerOn site with iPacket links on the model search page.

Edit: added for DealerCom too. The only one remaining is DealerOn.
