How to Build an App Like Zillow: Property Portal, AVM, and Agent Marketplace
Building an app like Zillow costs $140K-$220K and takes 14-18 weeks. RaftLabs has built data-heavy real estate platforms and marketplaces across 100+ products. The three core systems are: MLS data ingestion via RESO Web API or Bridge Interactive, map-based property search using Mapbox with Elasticsearch, and an automated valuation model trained on ATTOM comparable sales data. Expect 4-8 weeks for MLS data access before you write a line of product code.
Key Takeaways
- MLS data access is the first obstacle, not the last. In the US you need MLS membership and data agreements, or a third-party aggregator like Bridge Interactive. Expect 4-8 weeks for data access before you write a line of product code.
- Map-based search is the core UX. Mapbox with Elasticsearch handles geo queries, polygon search, and clustered pins at scale. Google Maps works too but Mapbox is cheaper at high request volumes.
- An automated valuation model accurate to 5-10% is achievable with scikit-learn trained on comps from ATTOM Data Solutions. The data license costs $15K-$40K per year depending on coverage.
- Agent lead routing is where the business model lives. When a buyer clicks contact, the lead goes to the listing agent for that property, or to a partner buyer's agent for general inquiries. Build a CRM webhook from day one.
- The portal is 14-18 weeks and $140K-$220K for a core build. The AVM adds 4-6 weeks and the data licensing cost on top.
Most real estate portals die on the data problem, not the product problem.
You can design a beautiful map search, write clean React components, and plan a great agent onboarding flow. Then you discover that getting MLS listing data takes 4-8 weeks of paperwork per market, requires a licensed broker to sponsor your application in many states, and costs real money in licensing fees. Teams that don't know this run into it at week two of development. The data layer is the critical path, not the UI.
According to the National Association of Realtors 2024 Technology Survey, 73% of buyers used the internet as their first step in searching for a home, and 52% found the home they purchased online.
This guide covers the full Zillow model: property portal plus automated valuation plus agent marketplace. The generic real estate app guide at /blog/how-to-build-real-estate-app covers the broader category. This one is specifically about the Zillow architecture: MLS ingestion, AVM, and lead routing.
Who actually builds this
Four groups build Zillow-style portals.
Regional brokerages want a branded property search to capture buyer leads before they land on Zillow or Realtor.com. When a buyer searches on Zillow, the lead goes to Zillow. When they search on the brokerage's own portal, the lead goes to one of the brokerage's agents. The economics are clear.
Niche proptech startups target segments that Zillow's algorithm treats as an afterthought. Commercial real estate, industrial properties, agricultural land, marinas. All are under-served by a portal built for residential transactions. A CoStar alternative for a specific niche is a real opportunity.
Regional MLS organizations want a consumer-facing search experience for their member agents. The listing data is already in the MLS database. The question is how to make it searchable for buyers.
Property developers want to own the buyer journey from search to sale on their own domain. Listing on Zillow means competing on Zillow. Building your own portal means owning the relationship from the first search to the signed contract.
The MLS data problem
In the United States, property listings come from the MLS, the Multiple Listing Service. There are about 580 regional MLS organizations across the country. Each one holds listing data for its geographic market. To display that data in a portal, you need a data agreement with each MLS.
The modern data standard is the RESO (Real Estate Standards Organization) Web API, which has replaced the older RETS protocol. The National Association of Realtors now requires RESO Web API compliance for MLS data access. The RESO Web API is a REST API that returns listings in a standardized JSON format: address, price, beds, baths, photos, status, agent info, and about 200 other fields.
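As a rough illustration of what an incremental pull against a RESO Web API feed might look like, the sketch below builds an OData query URL for recently modified active listings. The base URL is a placeholder, and the field names (`StandardStatus`, `ModificationTimestamp`, `ListPrice`, and so on) are assumed to follow the RESO Data Dictionary; individual MLS servers may expose different fields or require a different literal syntax for status values.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode

# Hypothetical endpoint: each MLS or aggregator issues its own URL and credentials.
BASE_URL = "https://api.example-mls.com/reso/odata"


def build_incremental_query(since: datetime, page_size: int = 200) -> str:
    """Build an OData URL for active Property records modified after `since`."""
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    params = {
        # Field names assumed from the RESO Data Dictionary.
        "$filter": f"StandardStatus eq 'Active' and ModificationTimestamp gt {stamp}",
        "$select": "ListingKey,ListPrice,BedroomsTotal,BathroomsTotalInteger,LivingArea",
        "$orderby": "ModificationTimestamp asc",
        "$top": str(page_size),
    }
    # Keep "$", "," and "'" readable; spaces and colons are percent-encoded.
    query = urlencode(params, safe="$,'", quote_via=quote)
    return f"{BASE_URL}/Property?{query}"
```

Storing the newest `ModificationTimestamp` seen after each run and passing it as `since` on the next run is one simple way to keep the sync incremental.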
Getting direct MLS access requires:
MLS membership, which usually requires a real estate license or broker sponsorship. The MLS reviews your data use agreement and intended application. Approval takes 4-8 weeks per MLS, sometimes longer. Each MLS has different rules about what data you can display and how.
The faster path is a third-party aggregator. Bridge Interactive (owned by Zillow Group) and ListHub are the two largest. They hold data agreements with hundreds of MLS organizations and expose a single normalized API. You pay a licensing fee, typically $5K-$20K per year depending on coverage, and skip the MLS paperwork entirely. For most startups and regional portals, this is the right move.
Outside the US, property data is usually less structured. In the UK, property listings come from Rightmove and Zoopla feeds, or from agency APIs. In Australia, the REIV and state portals hold the data. International builds need different sourcing strategies per market.
The property data model
Once you have the data feed, you need a schema to store it. Every listing has:
The core fields: address (street, city, state, zip), GPS coordinates, price, bedrooms, bathrooms, half-bathrooms, square footage, lot size, property type (single family, condo, townhouse, land, multi-family), listing status (active, pending, under contract, sold), and listing date.
The market context fields: days on market, price history, original list price, price reductions, listing agent name and license number, and sold price if applicable.
The property details: year built, garage spaces, pool, HOA presence and monthly fee, school district, zoning, tax assessment value, and annual property tax.
The media: photo URLs (stored in S3), virtual tour URL, video URL, and open house schedule.
PostgreSQL handles this well. One listings table with indexed columns for price, beds, baths, status, property type, and coordinates. The coordinate index is crucial: it powers the map queries. Use PostGIS for geo queries if you're running complex polygon searches in PostgreSQL, or push the geo search to Elasticsearch.
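To make the schema concrete, here is a minimal sketch of an internal listing record and a mapper from a RESO-style JSON record. The input field names are assumed from the RESO Data Dictionary, and the internal names (`price`, `beds`, `location`, and so on) are illustrative choices rather than a required layout.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Listing:
    listing_key: str
    address: str
    city: str
    state: str
    postal_code: str
    lat: float
    lon: float
    price: int
    beds: int
    baths: int
    sqft: int
    property_type: str
    status: str

    def to_search_doc(self) -> Dict[str, Any]:
        """Shape the record for a search index with a geo_point `location` field."""
        return {
            "listing_key": self.listing_key,
            "price": self.price,
            "beds": self.beds,
            "baths": self.baths,
            "sqft": self.sqft,
            "property_type": self.property_type,
            "status": self.status,
            "location": {"lat": self.lat, "lon": self.lon},
        }


def from_reso(record: Dict[str, Any]) -> Listing:
    """Map a RESO-style record onto the internal schema (field names assumed)."""
    return Listing(
        listing_key=str(record["ListingKey"]),
        address=record.get("UnparsedAddress", ""),
        city=record.get("City", ""),
        state=record.get("StateOrProvince", ""),
        postal_code=record.get("PostalCode", ""),
        lat=float(record["Latitude"]),
        lon=float(record["Longitude"]),
        price=int(record["ListPrice"]),
        # Optional numeric fields fall back to 0 when missing or null.
        beds=int(record.get("BedroomsTotal") or 0),
        baths=int(record.get("BathroomsTotalInteger") or 0),
        sqft=int(record.get("LivingArea") or 0),
        property_type=str(record.get("PropertyType", "")).lower(),
        status=str(record.get("StandardStatus", "")).lower(),
    )
```

Keeping latitude and longitude as first-class fields means the same record can feed both the PostgreSQL table and the search index.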
Map-based search: the core UX
The map is not a nice-to-have. It is the product. Buyers navigate real estate spatially. They search a neighborhood, not a city. They want to see what's available on the map, zoom in on a street, and understand density. According to Mapbox's 2024 Real Estate Platform Report, real estate portals with map-first interfaces show 2.3x higher engagement per session than list-first interfaces.
The implementation uses Mapbox (or Google Maps) with a backend geo query. Here's the flow:
The user opens the map. The frontend sends the current map bounding box (southwest and northeast coordinates) to the search API. The API queries Elasticsearch for listings within that bounding box, with the active filters applied (price range, beds, baths, property type). The results come back as a list of listings with lat/lng coordinates. The frontend renders property pins on the map.
As the user moves or zooms the map, the bounding box changes, triggering a new query. This is why Elasticsearch is the right tool: it handles geo range queries efficiently at large scale. PostgreSQL with PostGIS works at lower volumes, but Elasticsearch scales further.
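The bounding-box flow described above can be sketched as a function that turns the map corners and active filters into an Elasticsearch query body. The index field names (`location` as a `geo_point`, plus `price`, `beds`, and `status`) are assumptions about how the listings index is mapped.

```python
from typing import Any, Dict, List, Optional, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude)


def build_bbox_query(
    sw: LatLon,
    ne: LatLon,
    min_price: Optional[int] = None,
    max_price: Optional[int] = None,
    min_beds: Optional[int] = None,
    size: int = 500,
) -> Dict[str, Any]:
    """Build a query for active listings inside the visible map area."""
    filters: List[Dict[str, Any]] = [
        {"term": {"status": "active"}},
        {
            "geo_bounding_box": {
                "location": {
                    # The map sends SW/NE corners; the query wants top-left/bottom-right.
                    "top_left": {"lat": ne[0], "lon": sw[1]},
                    "bottom_right": {"lat": sw[0], "lon": ne[1]},
                }
            }
        },
    ]
    price: Dict[str, int] = {}
    if min_price is not None:
        price["gte"] = min_price
    if max_price is not None:
        price["lte"] = max_price
    if price:
        filters.append({"range": {"price": price}})
    if min_beds is not None:
        filters.append({"range": {"beds": {"gte": min_beds}}})
    # Filter context skips relevance scoring, which map queries do not need.
    return {"size": size, "query": {"bool": {"filter": filters}}}
```

Debouncing the map-move event on the frontend, so a query fires only once the user stops panning, keeps the request volume manageable.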
At low zoom levels, individual pins overlap and become unreadable. You need pin clustering. Mapbox has built-in clustering support. At zoom level 10, you show cluster bubbles with a count. At zoom level 14, individual pins appear.
Filter UI sits in a sidebar or drawer. Standard filters: price range (min and max), bedrooms (minimum), bathrooms (minimum), property type (checkboxes for single family, condo, townhouse, multi-family), and square footage. Advanced filters: year built, pool, garage, HOA, days on market.
Polygon search, where the user draws a custom boundary on the map, is a differentiating feature. The user draws a freehand polygon, the frontend sends the polygon coordinates to the API, and Elasticsearch runs a geo polygon query. It works well and users love it. Budget two weeks of engineering for a clean polygon draw UX.
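On the backend, the drawn polygon has to become a filter clause. The sketch below converts the vertices into a GeoJSON polygon and wraps it in a `geo_shape` clause. It uses `geo_shape` rather than the dedicated `geo_polygon` query on the understanding that recent Elasticsearch releases deprecate the latter; the `location` field name is an assumption about the index mapping.

```python
from typing import Any, Dict, List, Tuple


def build_polygon_filter(points: List[Tuple[float, float]]) -> Dict[str, Any]:
    """Turn drawn (lat, lon) vertices into a geo_shape filter clause."""
    if len(points) < 3:
        raise ValueError("a polygon needs at least three vertices")
    # GeoJSON orders coordinates as [lon, lat], the reverse of most map UIs.
    ring = [[lon, lat] for lat, lon in points]
    # GeoJSON rings must be closed: the last vertex repeats the first.
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return {
        "geo_shape": {
            "location": {
                "shape": {"type": "Polygon", "coordinates": [ring]},
                "relation": "intersects",
            }
        }
    }
```

The returned clause can sit in the same `bool` filter list as the price and bedroom filters, replacing the bounding-box clause while a drawn boundary is active.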
The automated valuation model
The Zestimate is Zillow's estimated market value for every property. You don't need to match Zillow's accuracy to add value. A model accurate to within 5-10% is useful. Zillow's own accuracy report shows a national median error rate of 2.4% for on-market homes and 7.49% for off-market homes. The off-market figure is the realistic benchmark for a regression model trained on ATTOM data.
"Automated valuation models have become table stakes in proptech. The differentiator is not the model itself but the freshness and coverage of the training data. A 90-day-old comp dataset produces meaningfully worse estimates than a 30-day dataset in fast-moving markets." -- Brad Inman, founder, Inman News (keynote address, Inman Connect 2024)
The inputs to a basic AVM:
Recent comparable sales (comps) within 0.5 miles, filtered to similar size (within 20% of square footage) and similar property type. The median sold price per square foot of recent comps is your baseline.
Adjustments for property-specific factors: year built (older properties trade at a discount), condition (assessed from tax records or photo analysis), garage spaces, pool, and lot size relative to neighborhood average.
Market trend adjustment: is this neighborhood appreciating or declining? Use the last 90 days of sold listings to compute a trend multiplier.
Days on market penalty: listings that have sat for 60+ days are likely priced too high relative to market. The AVM can flag this.
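The comps baseline from the first input above can be sketched in a few lines: filter sold listings to the same property type, within 0.5 miles and 20% of the subject's square footage, then multiply the median price per square foot by the subject's size. The dictionary keys and the `min_comps` threshold are illustrative assumptions, not part of any data vendor's format.

```python
import math
from statistics import median
from typing import Any, Dict, List, Optional

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def baseline_estimate(
    subject: Dict[str, Any],
    comps: List[Dict[str, Any]],
    radius_miles: float = 0.5,
    sqft_tolerance: float = 0.20,
    min_comps: int = 3,  # assumed threshold; tune per market
) -> Optional[float]:
    """Median comp price per square foot times the subject's square footage."""
    ppsf = []
    for c in comps:
        if c["property_type"] != subject["property_type"]:
            continue
        if abs(c["sqft"] - subject["sqft"]) > sqft_tolerance * subject["sqft"]:
            continue
        dist = haversine_miles(subject["lat"], subject["lon"], c["lat"], c["lon"])
        if dist > radius_miles:
            continue
        ppsf.append(c["sold_price"] / c["sqft"])
    if len(ppsf) < min_comps:
        return None  # too few comps to say anything useful
    return median(ppsf) * subject["sqft"]
```

Returning `None` when comps are thin is a deliberate choice: showing no estimate is usually better than showing one built on a single sale.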
A linear regression model in scikit-learn trains on a dataset of sold listings with known prices. The features are the inputs above. The target variable is sold price. With 10,000+ comps in a market, you can reach 7-10% median absolute error. Zillow's Zestimate uses a neural network with millions of data points and reaches 2-4% nationally.
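Whatever model you train, the accuracy figures quoted here are median absolute percentage errors on sold listings the model did not see during training. A minimal way to compute that number, assuming you already have parallel lists of predicted and actual sold prices from a holdout set:

```python
from statistics import median
from typing import Sequence


def median_abs_pct_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Median of |predicted - actual| / actual across a holdout set."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and the same length")
    return median(abs(p - a) / a for p, a in zip(predicted, actual))
```

A result of 0.08 means half of the estimates landed within 8% of the sold price. Tracking this per market, rather than as one global number, shows where thin comp coverage is hurting the estimates.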
Training data comes from two main sources: ATTOM Data Solutions and CoreLogic. Both sell historical property transaction data including sold prices, property characteristics, and tax assessments. ATTOM licenses start around $15K per year for a single market. CoreLogic is larger and more expensive. For initial development, you can also buy historical data as a one-time flat file from your county assessor's office, which is often free or very cheap.
The model runs as a Python microservice. When a listing is ingested or updated, the AVM service computes an estimated value and stores it alongside the listing. Display it on the property detail page with a confidence range, not just a point estimate. "Estimated value: $485,000-$515,000" is more honest than "$498,000" and reduces complaints when the AVM is wrong.
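One simple way to turn a point estimate into a displayed range is to apply a symmetric margin and round to a friendly step. In the sketch below, the 3% margin and $5,000 step are illustrative values that happen to reproduce the example band above; in practice the margin would more plausibly come from the model's measured error in that market.

```python
def value_range(point_estimate: float, margin: float = 0.03, step: int = 5000) -> str:
    """Format a point estimate as a rounded low-high range."""
    if not 0 < margin < 1:
        raise ValueError("margin must be a fraction between 0 and 1")
    # Round each bound to the nearest `step` so the range reads cleanly.
    low = round(point_estimate * (1 - margin) / step) * step
    high = round(point_estimate * (1 + margin) / step) * step
    return f"Estimated value: ${low:,}-${high:,}"
```

Calling `value_range(498000)` returns `"Estimated value: $485,000-$515,000"`.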
Agent profiles and lead routing
The Zillow business model is lead generation. A buyer clicks "contact agent" on a listing, and that lead goes somewhere. Where it goes determines your revenue model.
Three options:
Model 1: Lead goes to the listing agent. This is the most straightforward. Every listing has an agent of record. When a buyer contacts about that listing, the inquiry routes to that agent's email via a webhook. This works well for brokerage portals where all agents are in the same organization.
Model 2: Leads sold to buyer's agents (Zillow's Premier Agent model). A buyer submits a general inquiry ("I'm looking for a 3BR in Austin under $500K"). The lead goes to a buyer's agent who paid for lead access in that market. This is how Zillow generates most of its revenue. Building this requires a marketplace for agents to buy lead packages.
Model 3: Platform-employed agents. The portal employs its own agents who handle all buyer inquiries. Higher margin per transaction, much higher operational overhead.
For most portals, Model 1 is the right starting point. Build a lead capture form (name, email, phone, message) that sends an email notification and a webhook payload to the listing agent's CRM. Common CRMs to integrate: Follow Up Boss, BoomTown, kvCORE. All have webhook or API support.
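A minimal sketch of the Model 1 routing decision follows: a listing inquiry goes to the listing agent, and anything else falls back to a partner agent. The payload shape, the key names, and the fallback address are all hypothetical. Each CRM defines its own inbound format, so the dictionary would need mapping to whatever the target CRM expects.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional

REQUIRED_FIELDS = ("name", "email", "phone")


def build_lead_event(
    lead: Dict[str, str],
    listing: Optional[Dict[str, Any]] = None,
    fallback_agent_email: str = "leads@example.com",  # hypothetical partner inbox
) -> Dict[str, Any]:
    """Validate a contact form submission and decide who receives it."""
    missing = [f for f in REQUIRED_FIELDS if not lead.get(f)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    if listing and listing.get("listing_agent_email"):
        recipient = listing["listing_agent_email"]
        source = "listing_inquiry"
    else:
        recipient = fallback_agent_email
        source = "general_inquiry"
    return {
        "recipient": recipient,
        "source": source,
        "listing_key": listing.get("listing_key") if listing else None,
        "lead": {
            "name": lead["name"],
            "email": lead["email"],
            "phone": lead["phone"],
            "message": lead.get("message", ""),
        },
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


def to_webhook_body(event: Dict[str, Any]) -> str:
    """Serialize the event as the JSON body of a webhook POST."""
    return json.dumps(event)
```

Persisting the event before sending it means a failed webhook delivery can be retried without losing the lead.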
Saved searches and alerts
A buyer saves a search, say "3BR in Austin under $500K," and wants an email when new matching listings appear. This is a core retention feature.
Implementation: store the search parameters (city, price range, beds, baths, property type) against the user's account. Run a daily job that queries for listings created in the last 24 hours matching each saved search. For matches, send an email via SendGrid with a summary of the new listings.
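The daily job's matching step can be sketched as a pure function over stored search parameters and listing records. The key names below are illustrative. In production the filtering would more likely run as a database or search query than as an in-memory loop, but the logic is the same.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def matches(search: Dict[str, Any], listing: Dict[str, Any]) -> bool:
    """Return True when a listing satisfies every parameter set on the search."""
    if search.get("city") and listing["city"].lower() != search["city"].lower():
        return False
    if search.get("min_price") is not None and listing["price"] < search["min_price"]:
        return False
    if search.get("max_price") is not None and listing["price"] > search["max_price"]:
        return False
    if search.get("min_beds") is not None and listing["beds"] < search["min_beds"]:
        return False
    if search.get("min_baths") is not None and listing["baths"] < search["min_baths"]:
        return False
    types = search.get("property_types")
    if types and listing["property_type"] not in types:
        return False
    return True


def new_matches(
    search: Dict[str, Any],
    listings: List[Dict[str, Any]],
    now: datetime,
    window: timedelta = timedelta(hours=24),
) -> List[Dict[str, Any]]:
    """Listings created inside the window that match the saved search."""
    cutoff = now - window
    return [l for l in listings if l["created_at"] >= cutoff and matches(search, l)]
```

Parameters left unset on the search are treated as "no constraint", so a search with only a city and a maximum price still works.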
For real-time alerts (new listing appears and the user gets a push notification within minutes), you need an event-driven pipeline. When a new listing is ingested via the MLS feed, publish an event to a message queue (SQS or Kafka). A consumer reads the event, queries all saved searches for matches, and sends push notifications via Firebase Cloud Messaging.
Start with daily email alerts. Upgrade to real-time push if user feedback demands it.
Mortgage calculator
Every property detail page needs a payment calculator. Users want to know if they can afford the house, not just whether they like it.
The calculation is client-side. Inputs: purchase price (pre-filled from listing), down payment percentage, interest rate (default to current 30-year average, user can override), loan term (15 or 30 years), estimated property tax (pull from listing data or estimate at 1.2% of price), homeowner's insurance (estimate at $150-200/month).
Monthly principal and interest is a standard amortization formula. Add property tax divided by 12, plus insurance. The result is the estimated monthly payment. No backend call needed.
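For reference, the amortization formula is M = P * r(1+r)^n / ((1+r)^n - 1), where P is the loan amount, r the monthly interest rate, and n the number of monthly payments. The sketch below is written in Python for consistency with the other examples; the production version would be a few lines of client-side JavaScript. The default tax rate and insurance figure are the estimates mentioned above, not authoritative values.

```python
def monthly_principal_and_interest(loan_amount: float, annual_rate: float, years: int) -> float:
    """Standard fixed-rate amortization payment."""
    n = years * 12
    r = annual_rate / 12
    if r == 0:
        return loan_amount / n  # interest-free edge case avoids dividing by zero
    growth = (1 + r) ** n
    return loan_amount * r * growth / (growth - 1)


def monthly_payment(
    price: float,
    down_payment_pct: float,
    annual_rate: float,
    years: int = 30,
    annual_tax_rate: float = 0.012,   # 1.2% of price, used when listing data lacks tax
    monthly_insurance: float = 175.0,  # midpoint of the $150-200 estimate
) -> float:
    """Estimated monthly payment: principal, interest, property tax, insurance."""
    loan = price * (1 - down_payment_pct)
    pi = monthly_principal_and_interest(loan, annual_rate, years)
    tax = price * annual_tax_rate / 12
    return pi + tax + monthly_insurance
```

For a $400,000 home with 20% down at 6% over 30 years, principal and interest come to about $1,918.56 per month. Adding $400 of property tax and $175 of insurance gives roughly $2,493.56.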
Optional upgrade: partner with a lender to add a "Get pre-qualified" button. The user clicks, fills out a quick form, and the lender contacts them. This is a revenue opportunity.
Tech stack
| Layer | Technology | Why |
|---|---|---|
| Frontend | React + Next.js | SSR for SEO on property pages |
| Map | Mapbox GL JS | Better pricing than Google Maps at scale |
| Search | Elasticsearch | Geo queries, polygon search, full-text |
| API | Node.js | Fast, wide package support for webhook integrations |
| Database | PostgreSQL | Primary property store, relational data |
| AVM | Python (scikit-learn) | Standard ML tooling |
| Photos | AWS S3 + CloudFront | Cheap, fast CDN delivery |
| Cache | Redis | Search result caching, session data |
| Email | SendGrid | Saved search alerts, lead notifications |
Timeline and cost
A core portal with MLS ingestion, map search with filters, property detail pages, agent contact forms, saved searches, and mortgage calculator takes 14-18 weeks and costs $140K-$220K.
The AVM adds 4-6 weeks and $40K-$60K in development cost, plus $15K-$40K per year in data licensing from ATTOM or CoreLogic.
Mobile apps (iOS and Android) add 8-12 weeks and $60K-$100K.
The data licensing is an ongoing cost that many teams underestimate. MLS data fees, AVM training data, and photo storage add up to $30K-$80K per year depending on market coverage. Build this into your financial model before committing to the product.
What this is not
This is not a generic real estate app. If you want an app to let your brokerage manage client relationships, you want a CRM, not a portal. If you want to let landlords list rental properties directly, you want a marketplace like Craigslist or Apartments.com, not an MLS-connected portal.
The Zillow model is specifically about: consumer-facing property search connected to MLS data, an automated valuation that gives buyers a sense of fair price, and a lead routing system that connects buyers with agents. If that matches your use case, this is the right architecture.
If you're building something different, like a transaction management tool, a rental management platform, or a property developer's project website, talk to us about the right architecture for your specific problem.
Frequently asked questions
- A core Zillow-style portal with MLS data ingestion, map search, property detail pages, agent profiles, saved searches, and mortgage calculator costs $140K-$220K and takes 14-18 weeks. Adding an automated valuation model (Zestimate equivalent) adds $40K-$60K in development plus $15K-$40K per year in data licensing from ATTOM or CoreLogic. A full-featured platform with mobile apps, advanced AVM, and agent CRM runs $250K-$400K.
- In the US, property listings come from the MLS via the RESO Web API (the modern standard, which replaced the older RETS protocol). Access requires MLS membership and a data license agreement, which often requires a licensed broker to sponsor your application. The process takes 4-8 weeks per MLS. A faster path: use a third-party aggregator like Bridge Interactive or ListHub. They hold the MLS agreements and expose a single normalized API. You pay a licensing fee but skip the MLS paperwork.
- An AVM estimates property value algorithmically, using recent comparable sales within 0.5 miles, tax assessment value, price history, days on market, and neighborhood trends. A regression model built with scikit-learn and trained on ATTOM Data Solutions data can reach 5-10% median error in markets with plenty of recent comparable sales. Rural areas or unusual properties have higher error. Zillow's Zestimate uses a neural network with more features and much more training data, reaching 2-4% median error nationally.
- React frontend, Mapbox for map rendering, Elasticsearch for geo and full-text property search, Node.js API layer, PostgreSQL for the property database, Python for the AVM model, AWS S3 for photo storage, Redis for search result caching. Elasticsearch handles the property search because it natively supports geo distance queries, polygon search, and range filters on price and square footage, which PostgreSQL alone cannot serve efficiently at scale.
- Four groups build Zillow-style portals: (1) Regional brokerages that want a branded property search to capture buyer leads before they hit Zillow. (2) Niche proptech startups targeting a segment Zillow ignores, like commercial real estate or industrial properties. (3) Regional MLS organizations building a consumer-facing search experience for their member agents. (4) Property developers who want to own the buyer journey from search to sale on their own domain.