I saw in another post that Ally publishes all of its RV data, including historical RV data. I didn’t know about this! Of course, Ally’s RVs are probably very different from the captives’. But I still think there is probably a lot of useful information in there:
How much do RVs fluctuate historically?
Can we identify or predict trends in RVs based on prior months?
Are there any obnoxiously high RVs right now? (Cough Tacoma cough)
The data is in PDFs. For example, here is the current version. Edit: Apparently Ally really does not want you to link directly to anything on their site. Go to Ally dealer services, click Tools (upper right side of screen), and then Residual Value Lease Guide (RVLG).
I converted this with pdftotext and got something like the following, which looks annoying but probably parseable:
New Vehicle Residual Values
Model
Year
Make
2019
VOLVO
Model
Description
12
24 27 30 36 39 42 48 60
4dr Sdn T5 Momentum (860136 &105)
48
48 47 45 43 41 40 36 28
4dr Sdn T5 R-Design (860136 &111)
48
48 47 45 43 41 40 36 29
I suspect there are better tools for extracting data from PDFs. Any ideas?
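In the meantime, the pdftotext dump above does look parseable with a few lines of Python. This is only a sketch and it assumes the whole file follows the same pattern as the excerpt: the term header comes right after a "Description" line, and every trim line is followed by exactly one number per term. The names `parse_rv_text` and `SAMPLE` are made up for illustration, and the sketch ignores the year and make lines.

```python
import re

# A line that contains only whitespace-separated integers, e.g. "48 47 45".
NUMERIC = re.compile(r"\d+(?:\s+\d+)*")

SAMPLE = """\
Model
Description
12
24 27 30 36 39 42 48 60
4dr Sdn T5 Momentum (860136 &105)
48
48 47 45 43 41 40 36 28
4dr Sdn T5 R-Design (860136 &111)
48
48 47 45 43 41 40 36 29
"""


def parse_rv_text(text):
    """Return (terms, rows) from pdftotext output shaped like SAMPLE.

    Assumption: numbers that follow a "Description" line are the lease
    terms, and numbers that follow any other text line are that trim's
    RVs, but only if there is exactly one number per term.
    """
    terms = []
    rows = []
    label = None   # most recent non-numeric line
    numbers = []   # integers collected since that line

    def flush():
        nonlocal terms
        if label == "Description" and numbers:
            terms = list(numbers)
        elif terms and len(numbers) == len(terms):
            rows.append({"trim": label, "rv": dict(zip(terms, numbers))})
        # Anything else (e.g. a lone model year) is silently skipped.

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if NUMERIC.fullmatch(line):
            numbers.extend(int(tok) for tok in line.split())
        else:
            flush()
            label, numbers = line, []
    flush()
    return terms, rows


terms, rows = parse_rv_text(SAMPLE)
for row in rows:
    print(row["trim"], row["rv"])
```

If the real PDFs wrap long trim names or split the numbers differently, the length check will drop those rows rather than guess, so it is worth counting how many rows come out per page.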
Well, if we’re talking over multiple model years, in ‘17 they had crazy high 24-month Mercedes RVs. IIRC Micheal scored a crazy E-Class deal; I wasn’t here during that time.
Re 2: this is like trying to predict programs. @RVguy would be the guy to talk to.
My robotics club did something similar for PDF-to-text using Google Cloud. It’s free for the first 1k images. Does Azure have something similar?
Python-pdfminer seems a lot more powerful. I’ve been playing with the included pdf2txt to dump HTML, but I haven’t been able to get it to group the elements of each row together, presumably because they are so far apart on the page. But I suspect that is possible, and possibly easy, from the Python API.
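One way the row grouping might work from the Python API is to pull each text line with its page coordinates and bucket lines that share roughly the same vertical position. The sketch below assumes the maintained pdfminer.six fork (`extract_pages`, `LTTextContainer`, `LTTextLine`); the helper names, the 2-point tolerance, and the coordinates in the example are all made up for illustration and would need tuning against the actual Ally PDF.

```python
def group_into_rows(items, y_tol=2.0):
    """Group (x0, y0, text) items into rows, top to bottom, left to right.

    Two items land in the same row when their y0 values are within y_tol
    of the first item in that row. PDF y coordinates grow upward, so the
    largest y0 is the top of the page.
    """
    rows = []  # each entry: [reference_y, [(x0, text), ...]]
    for x0, y0, text in sorted(items, key=lambda it: -it[1]):
        if rows and abs(rows[-1][0] - y0) <= y_tol:
            rows[-1][1].append((x0, text))
        else:
            rows.append([y0, [(x0, text)]])
    return [[text for _, text in sorted(cells)] for _, cells in rows]


def rows_per_page(pdf_path, y_tol=2.0):
    """Yield one list of rows per page (requires pdfminer.six)."""
    # Imported here so group_into_rows can be tried without pdfminer.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer, LTTextLine

    for page in extract_pages(pdf_path):
        items = []
        for element in page:
            if not isinstance(element, LTTextContainer):
                continue
            for line in element:
                if isinstance(line, LTTextLine):
                    items.append((line.x0, line.y0, line.get_text().strip()))
        yield group_into_rows(items, y_tol)


# Made-up coordinates, just to show the grouping step on its own.
demo = [
    (300.0, 500.2, "48 47 45 43 41 40 36 28"),
    (50.0, 500.0, "4dr Sdn T5 Momentum (860136 &105)"),
    (250.0, 499.5, "48"),
    (50.0, 480.0, "4dr Sdn T5 R-Design (860136 &111)"),
    (250.0, 480.3, "48"),
]
for row in group_into_rows(demo):
    print(row)
```

Keeping the grouping separate from the PDF reading means the tolerance can be tested on a handful of hand-copied coordinates before running it over a whole guide.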
This might be extremely interesting, but the issue is that the data collection is going to be very time-consuming if you want it to reflect RVs for all banks.
Other than the Edmunds forums, which can help you with a few zip codes and models at a time at most, it is going to be very difficult to collect the data needed to make these calculations for captive lenders such as BMWFS and VCFS.
Agreed completely. My plan is just to analyze the Ally data since it’s there and see if we can do anything useful with it.
I’d love captive data, but that’s more complex, of course. Maybe we can use the Edmunds Deals? I also just saw that this morning. @RustyDaemon want to start scraping that too?
It’s hard to find at first, but I was able to get it to work using the instructions in the post. Let me know if you keep having trouble and I can take some more screenshots.
Due to my job, I’ve got all the historical data on every lender in most regions. RVs, rates, residuals and all applicable rebates. Going back about 6 yrs.
I have tried forecasting each component on every lender and found that there is nothing predictive within the Ally RVs themselves. Their adjustments are all done relative to ALG’s RVs and those adjustments (usually in the +5 to +8 range) occasionally change in tandem with their rate changes or if they push into a new brand.
Plus there are very few models currently where Ally has the best program.
US Bank is almost always -1 or -2 relative to ALG. But their rates are changing more frequently at the model-level to feather volume.
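For anyone who wants to check the "fixed offset from ALG" idea against their own numbers, the comparison is just a per-term subtraction. A rough sketch follows; the function names are made up, the lender values are the Volvo T5 Momentum figures from the pdftotext excerpt earlier in the thread, and the baseline values are placeholders, not real ALG numbers.

```python
def spread_vs_baseline(lender_rv, baseline_rv):
    """Per-term difference in RV points, for terms present in both dicts."""
    shared = sorted(lender_rv.keys() & baseline_rv.keys())
    return {term: lender_rv[term] - baseline_rv[term] for term in shared}


def is_constant_offset(spreads, tol=0):
    """True if every term has the same spread, within tol points."""
    values = list(spreads.values())
    return bool(values) and max(values) - min(values) <= tol


# Lender RVs by term (months), taken from the excerpt earlier in the thread.
lender = {24: 48, 36: 43, 48: 36}
# PLACEHOLDER baseline values, invented for illustration only.
baseline = {24: 42, 36: 37, 48: 30}

spreads = spread_vs_baseline(lender, baseline)
print(spreads, is_constant_offset(spreads))
```

If the spread really is a flat adjustment, tracking that one number per brand per month would capture most of what the full RV tables contain.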
There is a fairly cheap desking tool called LeaseScan, which dealers pay about $1k/mo for, that may give you access to all the programs around the country. I’m sure they would license it to a broker. Judging from the screenshots and a YouTube demo of it from back in 2011, it’s pretty retro and I doubt much has changed.
Maybe a stupid question, but I’ve always wondered: if the Edmunds forums have RVs for vehicles, wouldn’t it be much easier for them to offer that data as a drop-down or form where folks could look it up themselves? All they would have to do is update it monthly. It seems like a better approach than answering RV questions from folks on the forums.
From what I’ve gathered, the key to extracting and maintaining RV/MF data, via scraping or whatever method does the trick, is never posting that data in one place for anyone to see…allowing only for it to be accessed via request or as part of a larger output (a calculator, for example)