cityofphiladelphia / ais
Address Information System
Address summary has rows for these 3 addresses: 949-51 N LAWRENCE ST, 951 N LAWRENCE ST, 949 N LAWRENCE ST.
The address link table lists 949 and 951 N Lawrence St as "in range" of 949-51 N Lawrence St, but does not have an entry for the ranged address itself (949-51).
Block search uses the exclude_children function to filter out addresses with address_link relationship="in range". (This happens for other addresses having this relationship as well.)
This ranged address (949-51) also doesn’t have an OPA account #, so none of these show up in the block search response.
Truncate any input to 80 characters.
Reject any record that has non-ASCII characters.
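A minimal sketch of those two rules, in case it helps the discussion. The function name is made up, and the order (reject first, then truncate) is an assumption; the two lines above don't say which should run first.

```python
MAX_QUERY_LENGTH = 80  # limit taken from the rule above


def clean_input(raw: str) -> str:
    """Reject non-ASCII input, then truncate to 80 characters.

    Hypothetical helper: the check runs on the whole input before
    truncation, which is an assumption rather than a stated requirement.
    """
    if not raw.isascii():
        raise ValueError("input contains non-ASCII characters")
    return raw[:MAX_QUERY_LENGTH]
```

Checking before truncating means a record is rejected even if the offending character sits past position 80; swap the two steps if only the kept portion matters.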
ex. /block/asdoiuzxcv throws a 500 status code, implying a server error. It should be 404, to match bad /addresses/ searches.
254 W PASTORIUS ST # 254 - true_range
254-56 W PASTORIUS ST - pwd_parcel
254 W PASTORIUS ST - pwd_parcel
Address "505 W MOUNT AIRY AVE" does not appear among the results for the 500 block of W. Mt. Airy Ave. (nor do 501, 503, ..., 513, but 515 and above do, as do the even values on the block):
These need to be removed in the next build.
Currently address geometry is stored as text values in geocode_x, geocode_y fields (srid: 2272). Create geometry in this table for users to easily explore data spatially.
If you search for /addresses/randomzxoicuvk it returns an HTTP status code of 404 the first time, but if you hit refresh it returns 200 every time after that. Could this be gatekeeper caching it? Hopefully it's just a flasky thing.
/addresses/WELSH RD AND FRANKFORD AVE seems to return all WELSH RD addresses. Passyunk is typing this as an intersection_addr.
{'components': {'address': {'addr_suffix': None,
'addrnum_type': None,
'fractional': None,
'full': '1',
'high': None,
'high_num': None,
'high_num_full': None,
'isaddr': False,
'low': None,
'low_num': None,
'parity': None},
'base_address': 'WELSH RD & FRANKFORD AVE',
'bldgfirm': None,
'cl_addr_match': 'RANGE:2',
'matchdesc': None,
'responsibility': 'TOWNSHIP',
'seg_id': '1060825',
'st_code': '82240',
'street': {'full': 'WELSH RD',
'is_centerline_match': True,
'name': 'WELSH',
'parse_method': '2ANS',
'postdir': None,
'predir': None,
'suffix': 'RD'},
'street_2': {'full': 'FRANKFORD AVE',
'is_centerline_match': True,
'name': 'FRANKFORD',
'parse_method': '2ANS',
'postdir': None,
'predir': None,
'suffix': 'AVE'},
'street_address': 'WELSH RD & FRANKFORD AVE',
'unit': {'unit_num': None, 'unit_type': None},
'uspstype': None,
'zip4': None,
'zipcode': None},
'input_address': 'WELSH RD AND FRANKFORD AVE',
'type': 'intersection_addr'}
Searching for 2401 PENNSYLVANIA AVE without the include_units flag will return:
2401 PENNSYLVANIA AVE
2401 PENNSYLVANIA AVE OFC
where the second one has a unit.
Works: https://api.phila.gov/ais/v1/addresses/1234%20market
Does not work: https://api.phila.gov/ais/v1/addresses/1234%20market/
Perhaps Flask should ignore trailing slashes, or addresses should have invalid chars (like /, $, etc.) removed.
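For the second option, a rough sketch of stripping the slash from the query before it reaches the parser. The function name is hypothetical, and this says nothing about how the Flask routes themselves are declared.

```python
def strip_trailing_slashes(query: str) -> str:
    """Drop trailing slashes and spaces from an address query.

    Only the end of the string is touched, so a slash inside the
    address (for example a fractional house number) is left alone.
    """
    return query.rstrip("/ ")
```

With this, "1234 market/" and "1234 market" would be parsed as the same query.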
If this is intended behaviour feel free to close it out, but owner searches can have multiple versions of the same address / account number, ie:
Which both resolve to the same account
Currently AIS has a single model for street segments, but for implementing a /search endpoint that supports a generic street lookup we'll need a more abstract model called Street that relates all the segs with the same street code. The fields should be:
predir: text (not null)
name: text (not null)
suffix: text (not null)
postdir: text (not null)
full: text (not null, indexed, unique)
code: integer (not null, indexed, unique)
Add a foreign key to StreetSegment called street that references a Street object via its street code. This should be indexed as well. Remove fields in StreetSegment that are covered in the new model (street_predir, street_name, street_suffix, street_postdir, street_code).
Some of the engine scripts will need to be tweaked to use this model, most importantly load_streets, load_dor_parcels, and load_addresses.
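To make the proposed shape concrete, here is a sketch using the standard-library sqlite3 module as a stand-in for the real database and ORM (an assumption; the actual models would live wherever StreetSegment is defined). Table names and the segment IDs are invented for illustration; the MARKET ST code is the one used in the sample response elsewhere on this page.

```python
import sqlite3

# Schema sketch: Street fields as listed above, plus a street_segment
# table whose "street" column references a Street by its street code.
SCHEMA = """
CREATE TABLE street (
    id      INTEGER PRIMARY KEY,
    predir  TEXT NOT NULL,
    name    TEXT NOT NULL,
    suffix  TEXT NOT NULL,
    postdir TEXT NOT NULL,
    "full"  TEXT NOT NULL UNIQUE,     -- UNIQUE also creates an index
    code    INTEGER NOT NULL UNIQUE
);
CREATE TABLE street_segment (
    seg_id INTEGER PRIMARY KEY,
    street INTEGER NOT NULL REFERENCES street (code)
);
CREATE INDEX ix_street_segment_street ON street_segment (street);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(SCHEMA)

conn.execute(
    'INSERT INTO street (predir, name, suffix, postdir, "full", code) '
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("", "MARKET", "ST", "", "MARKET ST", 53560),
)
# Hypothetical segment IDs, both belonging to MARKET ST.
conn.executemany(
    "INSERT INTO street_segment (seg_id, street) VALUES (?, ?)",
    [(1, 53560), (2, 53560)],
)

# Generic street lookup: all segs for a given full street name.
segs = [
    row[0]
    for row in conn.execute(
        "SELECT seg.seg_id FROM street_segment seg "
        'JOIN street s ON seg.street = s.code WHERE s."full" = ? '
        "ORDER BY seg.seg_id",
        ("MARKET ST",),
    )
]
```

Empty strings stand in for missing directionals because every text field is declared not null.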
When I searched for a valid zipcode (e.g. 19143) I eventually got a 502 Bad Gateway. When I do the search directly against the API EC2 machine, after a long wait, I get, e.g.:
GET /addresses/19143
{
"query": "19143",
"normalized": [
null
],
"page": 1,
"page_count": 6592,
"page_size": 100,
"total_size": 659161,
"type": "FeatureCollection",
"features": [
...
]
}
The API currently searches for addresses, and will return the full normalized version of the address string in normalized. It's returning null because there is no address string here; Passyunk correctly identifies this as just a zipcode. However, the API then tries to go and query for everything, apparently ("total_size": 659161).
Noticed an error in Sentry where this line of the 404/500 error handler is throwing an error because the e object doesn't have an attribute code.
@app.errorhandler(404)
@app.errorhandler(500)
def handle_errors(e):
    error = json_error(e.code, e.description, None)  # happening here
    return json_response(response=error, status=e.code)
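One possible way around it, sketched without Flask so it runs on its own: read code and description defensively and fall back to 500 when the exception is a plain Python error rather than an HTTP exception. The helper name is hypothetical, and wiring it into handle_errors is left out.

```python
def error_status_and_message(e: Exception) -> tuple:
    """Return (status, message) for any exception.

    HTTP-style exceptions carry `code` and `description`; plain
    exceptions (ValueError, etc.) do not, so fall back to 500 and
    the exception's own text.
    """
    status = getattr(e, "code", None)
    if not isinstance(status, int):
        status = 500
    message = getattr(e, "description", None) or str(e) or "Internal Server Error"
    return status, message
```

The handler could then call json_error(status, message, None) with these values instead of touching e.code directly.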
This is a list of requirements that should be satisfied to launch the AIS API. Tests will be built around these requirements.
Can filter to return only addresses that have OPA numbers
A single base address that is not in a range returns a single result.
A range address query returns a single result (i.e., does not include child addresses)
A query for an address that falls in a ranged address should return a single result where the address is the one queried, but the opa_address field is the ranged address. For example, a query for 523 N Broad St will return the address:
{
"type": "Feature",
"properties": {
"street_address": "523 N BROAD ST",
"opa_address": "523-25 N BROAD ST",
...
},
"geometry": {
"type": "Point",
"coordinates": [
-75.16053912062759,
39.962545839472334
]
}
}
A query for a base address returns the base address as well as any units
in that base.
A query for a base address that is a range returns the units in all of
the child addresses for that range.
The API treats unit types of Apt, Unit, #, and Ste as interchangeable;
searching for one will search for any of them.
The API does not treat other unit types interchangeably; a unit type of
"floor" will only match "floor".
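The two unit-type requirements could be checked with something like the following. It is only a sketch; the function and constant names are invented, and comparing case-insensitively is an assumption.

```python
# Unit types the requirements above treat as interchangeable.
GENERIC_UNIT_TYPES = frozenset({"APT", "UNIT", "#", "STE"})


def unit_types_match(query_type: str, stored_type: str) -> bool:
    """True if a queried unit type should match a stored one."""
    q = query_type.strip().upper()
    s = stored_type.strip().upper()
    if q in GENERIC_UNIT_TYPES and s in GENERIC_UNIT_TYPES:
        return True  # Apt, Unit, # and Ste all match each other
    return q == s    # anything else only matches itself
```

A test suite built around the requirements could assert, for example, that "Apt" matches "#" but "floor" does not match "Unit".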
Related, e.g., to the following Argus issues:
500: INTERNAL SERVER ERROR
https://argus.phila.gov/phila/property-search/issues/705/
ValueError: Invalid zipcode
https://argus.phila.gov/phila/ais/issues/685/
Ranged addresses that overlap each other need to be flagged and reported.
2108-14 MARKET ST - OPA
2110-12 MARKET ST - DOR?
Non-ranged addresses
2108 MARKET ST
2110 MARKET ST
2112 MARKET ST
2114 MARKET ST
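A sketch of how overlapping ranges like these might be flagged. It assumes the abbreviated high number replaces the trailing digits of the low number (2108-14 means 2108 to 2114), and that only ranges with the same parity on the same street can overlap. All names are hypothetical.

```python
import re
from itertools import combinations

RANGE_RE = re.compile(r"^(\d+)-(\d+)\s+(.+)$")


def parse_range(address):
    """Return (low, high, street) for a ranged address, else None."""
    m = RANGE_RE.match(address.strip())
    if m is None:
        return None
    low_text, high_text, street = m.groups()
    if len(high_text) < len(low_text):
        # Abbreviated high number: borrow the leading digits of low.
        high_text = low_text[: len(low_text) - len(high_text)] + high_text
    return int(low_text), int(high_text), street


def find_overlaps(addresses):
    """Return pairs of ranged addresses whose ranges intersect."""
    ranged = [(a, parse_range(a)) for a in addresses]
    ranged = [(a, r) for a, r in ranged if r is not None]
    overlaps = []
    for (a1, (lo1, hi1, st1)), (a2, (lo2, hi2, st2)) in combinations(ranged, 2):
        same_side = lo1 % 2 == lo2 % 2  # assumption: parity = side of street
        if st1 == st2 and same_side and lo1 <= hi2 and lo2 <= hi1:
            overlaps.append((a1, a2))
    return overlaps
```

Run over the addresses listed above, this would report the pair 2108-14 MARKET ST and 2110-12 MARKET ST and ignore the non-ranged addresses.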
There are 75 records returned. About half do not have coordinates; the other half do.
Searching for 8201 HENRY AVE APT 32B returns a 404 because the address doesn't exist in the database. As a result, if you look up that address in the new AIS-backed Property Search you get 0 properties found, even though 8201 HENRY AVE is an OPA address. It seems like the current Property Search has the same behavior so this is "no harm done", but might be worth revisiting later.
is coming back as 19107-9997. USPS has 19107-3727 (see here).
1930 CHESTNUT ST OFC
1930 CHESTNUT ST
1930-34 CHESTNUT ST
These addresses match pwd parcel 531391.
There are 108 APT addresses at this address that are true range matches, e.g. 1930 CHESTNUT ST APT 10B
Currently, service area values are calculated in make_service_area_summary and stored in the service_area_summary table, where each column corresponds to a service area. Since service areas can be added or removed in config.py, the table is generated dynamically on each run and does not have a fixed schema in models.py. This eases the configuration process but has created problems with migrations and initializing new engine databases. It may be helpful to deprecate the service_area_summary table in favor of a service_area_tag table with a key-value structure like:
| id | street_address | service_area_id | value |
=================================================================
| 1 | 12 OAK LN | census_block | 1 |
| 2 | 12 OAK LN | rubbish_day | Thurs |
@mjumbewu do you think this could be workable in terms of API performance? With the right indexes?
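For reference, turning one wide summary row into key-value rows is a small transformation. This sketch uses plain dicts and a made-up function name; the id column is assumed to be assigned by the database.

```python
def to_service_area_tags(street_address, summary_row):
    """Convert one wide service-area row into key-value tag rows.

    `summary_row` maps service area IDs (the old column names) to values.
    """
    return [
        {
            "street_address": street_address,
            "service_area_id": area_id,
            "value": value,
        }
        for area_id, value in summary_row.items()
    ]
```

Adding or removing a service area in config.py would then change which rows exist rather than which columns the table has.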
Add a create statement for this view:
SELECT COALESCE(r.seg_id, l.seg_id) AS seg_id,
r.low AS true_right_from,
r.high AS true_right_to,
l.low AS true_left_from,
l.high AS true_left_to
FROM ( SELECT asr.seg_id,
min(a.address_low) AS low,
GREATEST(max(a.address_low), max(a.address_high)) AS high
FROM address a
JOIN address_street asr ON a.street_address = asr.street_address
GROUP BY asr.seg_id, asr.seg_side
HAVING asr.seg_id IS NOT NULL AND asr.seg_side = 'R'::text) r
FULL JOIN ( SELECT asl.seg_id,
min(a.address_low) AS low,
GREATEST(max(a.address_low), max(a.address_high)) AS high
FROM address a
JOIN address_street asl ON a.street_address = asl.street_address
GROUP BY asl.seg_id, asl.seg_side
HAVING asl.seg_id IS NOT NULL AND asl.seg_side = 'L'::text) l ON r.seg_id = l.seg_id
ORDER BY r.seg_id
Hey @mjumbewu! I'm trying to get the API running on my local machine and couldn't remember exactly how you were using .env files. I remember you had separate ones for dev/prod; how did that work again? If you give me some pointers I'll write them into the docs and maybe commit an .env.sample.
Thanks!
They used to be at the top level, now they're in ais/ais/engine/bin/migrations.
Searching for 1410-1430 GERMANTOWN gives a 404. OPA API returns a number of addresses in that range.
The load_addresses script is relating 1911 GREEN ST FL 2 to the property for 1911 GREEN ST # 2, which is not really valid (and is causing FL 2 to show up in Property Search even though it's not an OPA address). Only interchangeable units like #, APT, UNIT, and STE should be allowed to have generic unit relationships.
I assume that ideally the API would always return in the expected format (json). It certainly makes parsing/detecting errors easier.
Change update statement (ln. 341) to get opa_address from op.source_address instead of op.street_address
print('Populating OPA accounts...')
prop_stmt = '''
update address_summary asm
set opa_account_num = op.account_num,
opa_owners = op.owners,
opa_address = op.street_address
from address_property ap, opa_property op
where asm.street_address = ap.street_address and
ap.opa_account_num = op.account_num
'''
db.execute(prop_stmt)
db.save()
These addresses are matched to PWD parcel 996352.
1935 CHESTNUT ST APT 2F
1935 CHESTNUT ST APT 2R
1935 CHESTNUT ST APT 3F
1935 CHESTNUT ST APT 3R
1935 CHESTNUT ST
1935-37 CHESTNUT ST
These addresses match to DOR Parcel 001S110288
1935 CHESTNUT ST
1935-37 CHESTNUT ST
1937 CHESTNUT ST - dor_parcel match
These addresses are true_range
1937 CHESTNUT ST # 1
1937 CHESTNUT ST APT 2F
1937 CHESTNUT ST APT 2R
1937 CHESTNUT ST APT 3F
1937 CHESTNUT ST APT 3R
1937 CHESTNUT ST APT 4F
When load_addresses encounters a range address with a unit num like 1234-36 CHESTNUT ST # 200, it adds the base address 1234-36 CHESTNUT ST but it doesn't create the child addresses for that range. Note that those addresses should not have the unit number of the original address -- use address_link instead to make that connection.
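A sketch of generating the missing children, assuming the low and high numbers are already parsed and that children step by two so they share the parity of the range (as with 949-51 covering 949 and 951). The function names and the tuple layout for links are hypothetical; "in range" is the relationship name used by address_link elsewhere on this page.

```python
def range_children(address_low: int, address_high: int, street_full: str) -> list:
    """Child addresses for a range, without any unit number."""
    return [
        f"{num} {street_full}"
        for num in range(address_low, address_high + 1, 2)
    ]


def range_links(base_address, address_low, address_high, street_full):
    """(child, relationship, parent) tuples tying children to the base."""
    return [
        (child, "in range", base_address)
        for child in range_children(address_low, address_high, street_full)
    ]
```

For 1234-36 CHESTNUT ST # 200 this yields 1234 CHESTNUT ST and 1236 CHESTNUT ST, each linked to the base 1234-36 CHESTNUT ST rather than carrying # 200 themselves.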
AIS should use logging and have some framework in place for managing/shipping logs.
MATCH TYPES
exact
base_address
unit_child: if we search for 1769 FRANKFORD AVE specifying include_units, return 1) an exact match for the base address followed by all children with type unit_child
unit_sibling: assume AIS has # 4 and APT 4. If we search for UNIT 4, it should return # 4 and APT 4 with match type same_unit. Use matches_unit relationship type from address_link.
unmatched: if we search for an address that doesn't exist in AIS, drop down to centerline geocode and return service areas.
sample response:
{
"search_type": "address", TODO
"search_params": { TODO
},
"query": "1769 frankford ave",
"normalized": "1769 FRANKFORD AVE",
"page": 1,
"page_count": 1,
"page_size": 1,
"total_size": 1,
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"match_type": TODO (exact/base_address/unit_child/unit_sibling/unmatched)
"ais_feature_type": "address",
"properties": {
"street_address": "1769 FRANKFORD AVE",
"address_low": 1769,
"address_low_suffix": "",
"address_low_frac": "",
"address_high": null,
"street_predir": "",
"street_name": "FRANKFORD",
"street_suffix": "AVE",
"street_postdir": "",
"unit_type": "",
"unit_num": "",
"street_full": "FRANKFORD AVE",
"street_code": 34960,
"seg_id": 543011,
"zip_code": "19125",
"zip_4": "2422",
"usps_bldgfirm": "",
"usps_type": "S",
"election_block_id": "24010847",
"election_precinct": "1810",
"pwd_parcel_id": "128133",
"dor_parcel_id": null,
"li_address_key": "748166",
"pwd_account_nums": [
"3496001769001"
],
"opa_account_num": null,
"opa_owners": null,
"opa_address": null,
"geom_type": "centroid",
"geom_source": "pwd_parcel",
"center_city_district": "",
"cua_zone": "Asociaci\u00f3n Puertorrique\u00f1os en Marcha for Everyon",
"li_district": "North",
"philly_rising_area": "",
"census_tract_2010": "015800",
"census_block_group_2010": "1",
"census_block_2010": "1009",
"council_district_2016": "5",
"political_ward": "18",
"political_division": "1810",
"planning_district": "River Wards",
"elementary_school": "Adaire",
"middle_school": "Adaire",
"high_school": "Penn Treaty",
"zoning": "RM1",
"police_division": "EPD",
"police_district": "26",
"police_service_area": "263",
"recreation_district": "6",
"rubbish_recycle_day": "FRI",
"recycling_diversion_rate": 0.062,
"leaf_collection_area": "Saturday Bag Dropoff",
"sanitation_area": "3",
"sanitation_district": "3F",
"historic_street": "",
"highway_district": "3",
"highway_section": "3F",
"highway_subsection": "3F10",
"traffic_district": "1",
"traffic_pm_district": "1212",
"street_light_route": "48",
"pwd_maint_district": "3E",
"pwd_pressure_district": "TLS",
"pwd_treatment_plant": "BAXTER",
"pwd_water_plate": "39",
"pwd_center_city_district": "",
"related_addresses": [ TODO
{
"address": "",
"relationship": "range_child/range_parent/base_address/same_unit"
},
...
]
},
"geometry": {
"geocode_type": "",
"type": "Point",
"coordinates": [
-75.131696154867,
39.976398436979
]
}
}
]
}
See 126-30 N 10TH ST. Error table says it's matching to multiple seg IDs, but one of those has a range of 100-120. Should not be matching to that.
Otherwise centerline interpolation is more accurate.
ULRS resolves TAGGERT SCHOOL to 1701-47 CHELTEN AVE. AIS API is not handling this currently.
After #63 is complete, we'll need a view that takes a street_full and returns the corresponding Street object. The response should look like:
{
"query": "MARKET STREET",
"normalized": [
"MARKET ST"
],
"page": 1,
"page_count": 1,
"page_size": 1,
"total_size": 1,
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {
"street_predir": "",
"street_name": "MARKET",
"street_suffix": "ST",
"street_postdir": "",
"street_code": 53560
},
"geometry": {
"type": "Point",
"coordinates": [
<midpoint x>,
<midpoint y>
]
}
}
]
}
For the geometry, query all the segs with that street code and find the midpoint (snapped to a seg).
to import lists of suffixes and directionals.
Related to issue CityOfPhiladelphia/property2#153.
11 NORTH 2ND STREET # 501 matches to 11-15 N 2ND ST #501 in the OPA API but has no match in AIS.
There is a parcel 621-25 REED ST in both PWD and DOR that this should be geocoding to.
Addresses with an R suffix should appear after the base address in results, e.g.: