Dissolving feature/field by interval using ArcGIS Desktop?
I have a huge dataset of flow paths, with a catchment area stored for each line. The attribute table has ~780,000 rows, and therefore 780,000 features in the map window.
Is it possible to use a field to merge the data in intervals, so that you get multi-part features instead?
For instance, merge all features with a catchment area of 1-10,000 into one, then 10,001-20,000, 20,001-30,000 and so on? The interval doesn't have to be 10,000; it can be another value.
I was experimenting with the Dissolve tool, but other than manually populating a field and then dissolving by that attribute, I couldn't find a way to do it.
Basically, I want what the symbology window does when it classifies values, but applied to the dataset itself.
I have ArcGIS Desktop 10.5.1 with the Advanced License.
Edit:
import pandas as pd
import arcpy
fc = r'C:\Users\JGJ\Desktop\Database\Vejle.gdb\StromningsVeje' #Change to match your data
bins = [-1,10000,50000,100000,200000,500000,1000000,5000000,10000000,100000000,1000000000] #Add/remove bins
df = pd.DataFrame.from_records(data=arcpy.da.SearchCursor(fc,['OID@','flow']),columns=['OID','flow'])
df['Interval'] = pd.cut(df['flow'], bins=bins).astype(str)
intervaldictionary = dict(zip(df.OID,df.Interval))
with arcpy.da.UpdateCursor(fc,['OID@','Intervalfield']) as cursor:
    for row in cursor:
        if row[0] in intervaldictionary:
            row[1] = intervaldictionary[row[0]]
        else:
            row[1] = 'Unknown interval'
        cursor.updateRow(row)
I ran into a snag when processing a large dataset:
Runtime error
Traceback (most recent call last):
  File "<string>", line 7, in <module>
  File "C:\Python27\ArcGIS10.5\lib\site-packages\pandas\core\frame.py", line 950, in from_records
    values += data
MemoryError
Any suggestions on how to get past this?
@BERA Added an error message I'm receiving, with a new dataset.
– FoolzRailer
Sep 16 '18 at 18:36
I've added a part showing how to skip the pandas part.
– BERA
Sep 17 '18 at 5:02
Yes, it sounds very weird. Do you have many millions of rows? You could calculate a field like area/10000 and then use this area field in the code. Maybe it will not consume as much RAM if some of the zeroes are left out. No, that is not needed; elif < 50000 etc. is enough.
– BERA
Sep 17 '18 at 5:59
Nice. You could try splitting the data, dissolving each part, then merging all the parts and dissolving again.
– BERA
Sep 17 '18 at 8:42
1 Answer
Create a new field of intervals and assign an interval to each record, then dissolve by the interval field. This can be done with the Field Calculator and many if-elif-else branches, or you can use the pandas module (which is included in ArcGIS 10.5) and pandas.cut:
Use cut when you need to segment and sort data values into bins
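As a rough illustration of what pandas.cut produces (a minimal sketch with made-up area values, not data from the question), each value is labelled with the interval it falls in. By default the intervals include their right edge:

```python
import pandas as pd

# Hypothetical area values; the bin edges are the example bins used in this answer.
areas = pd.Series([5, 10, 15, 45, 120])
bins = [-1, 10, 20, 30, 40, 50, 9999]

# Each value gets the label of the interval (left, right] that contains it.
labels = pd.cut(areas, bins=bins).astype(str)
print(labels.tolist())
# ['(-1, 10]', '(-1, 10]', '(10, 20]', '(40, 50]', '(50, 9999]']
```

Values that fall outside every bin get no interval (NaN), so the first and last bin edges should span the full range of the field.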
Add a text field named Intervalfield, modify the indicated lines, and execute the code in the Python window of ArcMap. Then Dissolve by Intervalfield. If you want to use some other area field instead of the shape area, replace SHAPE@AREA with the name of your field:
import pandas as pd
import arcpy
fc = r'C:\data.gdb\featureclass' #Change to match your data
bins = [-1,10,20,30,40,50,9999] #Add/remove bins
#Read object id and area of every feature into a dataframe
df = pd.DataFrame.from_records(data=arcpy.da.SearchCursor(fc,['OID@','SHAPE@AREA']),columns=['OID','Area'])
#Label each area with the interval it falls in, stored as text
df['Interval'] = pd.cut(df['Area'], bins=bins).astype(str)
#Lookup of object id -> interval label
intervaldictionary = dict(zip(df.OID,df.Interval))
#Write the interval label to the text field Intervalfield
with arcpy.da.UpdateCursor(fc,['OID@','Intervalfield']) as cursor:
    for row in cursor:
        if row[0] in intervaldictionary:
            row[1] = intervaldictionary[row[0]]
        else:
            row[1] = 'Unknown interval'
        cursor.updateRow(row)
After running the code, every feature has its interval written to Intervalfield; then dissolve by that field.
If you get a MemoryError, you are out of RAM. In that case skip the pandas part and use da.UpdateCursor with if-elif-else:
import arcpy
fc = r'C:\data.gdb\featureclass' #Change to match your data
areafield = 'SHAPE@AREA'
intervalfield = 'Interval'
with arcpy.da.UpdateCursor(fc,[areafield,intervalfield]) as cursor:
    for row in cursor:
        if row[0]<10000:
            row[1] = '1' #Or '0-10000', whatever you want
        elif row[0]<50000:
            row[1] = '2'
        #More elifs...
        else:
            row[1] = '9999'
        cursor.updateRow(row)
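If the if-elif chain gets long, the label can also be computed from a list of bin edges, or from a fixed interval width as described in the question. The following is only a sketch in plain Python: the helper names label_by_edges and label_fixed_width are hypothetical, and unlike the if-elif code above, each bin is assumed to include its upper edge (as pandas.cut does by default).

```python
from bisect import bisect_left

def label_by_edges(value, edges):
    """Label for the bin containing value; edges are sorted upper bounds (inclusive)."""
    i = bisect_left(edges, value)  # index of the first edge >= value
    if i == len(edges):
        return 'Unknown interval'  # value is above the last edge
    lower = edges[i - 1] + 1 if i > 0 else 0
    return '{}-{}'.format(lower, edges[i])

def label_fixed_width(value, width):
    """Label for equal-width bins 1..width, width+1..2*width, and so on."""
    # Values at or below zero are clamped into the first bin.
    index = max(0, int((value - 1) // width))
    lower = index * width + 1
    return '{}-{}'.format(lower, lower + width - 1)

edges = [10000, 50000, 100000]  # example edges, adjust to your data
print(label_by_edges(10001, edges))      # 10001-50000
print(label_fixed_width(25000, 10000))   # 20001-30000
```

Inside the cursor loop, the whole if-elif-else block could then be replaced by a single assignment such as row[1] = label_by_edges(row[0], edges), which only keeps one row in memory at a time.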
This works perfectly, thank you. As a bonus question out of curiosity and it isn't required as I can use this fine, but is it possible to put this into a toolbox, where you can choose your input file and intervals instead of "hardcoding" it?
– FoolzRailer
Sep 14 '18 at 11:21
@FoolzRailer Nice! I'm sure it is, but I don't have much experience with toolboxes. You could post this as a new question, including what you tried and why it did not work.
– BERA
Sep 14 '18 at 11:23
@BERA ArcGIS Desktop 10.5.1 with Spatial Analyst and Advanced License.
– FoolzRailer
Sep 14 '18 at 9:00