gher-uliege / dincae
DINCAE (Data-Interpolating Convolutional Auto-Encoder) is a neural network to reconstruct missing data in satellite observations.
License: GNU General Public License v3.0
I'm sorry to ask a question about your paper that is not directly related to the DINCAE code.
Is there a reason for subtracting a time average that is computed without the cross-validation data? As I understand this sentence, the cross-validation data are treated as independent of the data from the full study period. Am I right? Or is there another reason for doing it this way?
Kind Regards,
Sihun Jung
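One reading of the sentence in question is that the time average is computed only from the values that remain once the cross-validation samples are withheld, so the withheld values never influence the preprocessing. The following is a minimal sketch of that idea in plain Python, not the code used in the paper. The names `mean_without_withheld`, `anomalies`, `sst` and `cv_indices` are hypothetical, and a single pixel's time series stands in for the full field.

```python
import math

def mean_without_withheld(series, withheld):
    """Time mean of one pixel's series, ignoring NaNs (clouds) and
    any time index listed in `withheld` (cross-validation samples)."""
    kept = [v for t, v in enumerate(series)
            if t not in withheld and not math.isnan(v)]
    if not kept:
        return math.nan  # no usable observation at this pixel
    return sum(kept) / len(kept)

def anomalies(series, withheld):
    """Subtract the mean computed without the withheld samples
    from every time step, including the withheld ones."""
    m = mean_without_withheld(series, withheld)
    return [v - m for v in series]

# Hypothetical SST values (degrees C) at one pixel; NaN marks a cloud.
sst = [20.0, math.nan, 22.0, 30.0]
cv_indices = {3}  # the last image is withheld for cross-validation

print(mean_without_withheld(sst, cv_indices))  # 21.0
```

With this choice the withheld value (30.0) does not shift the mean, which is what keeps it independent of the training statistics.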
Hello, after reading your paper and code, I tried using DINCAE 1.0 to reconstruct remote sensing data. So far I have successfully trained the network and saved the 500th and 1000th training models (the network parameters, as .ckpt files).
I have a question. As mentioned in the paper, the data are divided into training data and test data. During training, Gaussian noise and additional cloud masks are added to the training data, while nothing extra is added to the test data. In addition, 50 images were used as validation data, and these 50 images did not take part in model training. So, according to my understanding, should we use the trained model parameter file (.ckpt) to reconstruct these 50 validation images after training?
My current approach is to add this line to the reconstruct function: save.restore(sess,'E:/DINCAE/DINCAE master/model/model-1000.ckpt'), placed before "loop over epochs". I then create the input data from the 50 validation images and run DINCAE. Can I understand this as using the trained network parameters to reconstruct the validation data? I only saw functions for training the network in your code, and did not find how to reconstruct the validation data directly from the trained parameters in order to verify the effectiveness of the model.
In summary, my problem is that I don't know how to use the already trained .ckpt parameter files. My current approach is to load these parameter files, then run DINCAE, and use the 1/1000 epoch output file as the reconstructed image. I did this because the code outputs the reconstruction of the test data, and nothing extra is added to the test data. My input data are therefore the 50 validation images, and running DINCAE then stores the reconstruction of those images.
Sorry, I am a beginner and not very familiar with deep learning, and I have written a lot here. I would be very grateful for your answer. Thank you very much!
Describe the bug
A description of what the bug is.
To Reproduce
I am trying to set up and run DINCAE but am getting an error.
Environment
Full output
varname SST (112, 112)
data shape: (100, 112, 112)
data range: 13.575001 30.225
sz (100, 112, 112)
sz (100, 112, 112)
Number of input variables: 10
regularization_L2_beta 0
enc_nfilter_internal [16, 24, 36, 54]
nvar 10
nepoch_keep_missing 0
Traceback (most recent call last):
  File "E:\D BackUp\PPL Works\UdaySir\DINCAE-master\run_DINCAE.py", line 14, in <module>
    DINCAE.reconstruct_gridded_nc(filename,varname,outdir)
  File "E:\D BackUp\PPL Works\UdaySir\DINCAE-master\DINCAE\__init__.py", line 741, in reconstruct_gridded_nc
    **kwargs)
  File "E:\D BackUp\PPL Works\UdaySir\DINCAE-master\DINCAE\__init__.py", line 405, in reconstruct
    train_iterator_handle = sess.run(train_iterator.string_handle())
AttributeError: 'OwnedIterator' object has no attribute 'string_handle'
Input file
Filename: "avhrr_sub_add_clouds_small.nc"
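A note on the error above: `OwnedIterator` is the iterator class that TensorFlow 2.x uses when eager execution is enabled, and it has no `string_handle` method. The traceback is therefore consistent with graph-mode (TensorFlow 1.x style) code being run under TensorFlow 2.x. This is an inference from the message, not a confirmed diagnosis. The sketch below shows one way to switch eager execution off before any graph is built; `needs_graph_mode` and `configure_tensorflow` are hypothetical helper names, and installing a TensorFlow 1.x release would be the alternative.

```python
def needs_graph_mode(tf_version):
    """Return True for TensorFlow 2.x or later, where eager
    execution is enabled by default."""
    major = int(tf_version.split(".")[0])
    return major >= 2

def configure_tensorflow():
    """Disable eager execution when running under TensorFlow 2.x.

    Must be called before any graph, dataset or session is created.
    """
    import tensorflow as tf  # imported lazily so the helper above works without it
    if needs_graph_mode(tf.__version__):
        tf.compat.v1.disable_eager_execution()
    return tf

print(needs_graph_mode("2.11.0"))  # True
print(needs_graph_mode("1.15.5"))  # False
```

A call to `configure_tensorflow()` would go at the top of `run_DINCAE.py`, before `DINCAE.reconstruct_gridded_nc`. Whether the rest of DINCAE 1.0 then runs unchanged under TensorFlow 2.x is not something this sketch verifies.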
Describe the bug
A description of what the bug is.
To Reproduce
Please provide a minimal code example which reproduces the behavior (bug, performance regression, ...).
Environment
Full output
In case of an error, please paste the full error message and stack trace.
The full output is in the attached file record_modis_chl_3d.txt.
Input file
Run the shell command ncdump -h myfile.nc
and paste the output here.
`netcdf input_file_python {
dimensions:
lon = 445 ;
lat = 459 ;
time = UNLIMITED ; // (3 currently)
variables:
float lon(lon) ;
float lat(lat) ;
float time(time) ;
time:units = "days since 2021-01-01 00:00:00" ;
float chl(time, lat, lon) ;
chl:_FillValue = -9999.f ;
int mask(lat, lon) ;
mask:comment = "one means the data location is valid (e.g. sea for SST), zero the location is invalid (e.g. land for SST)" ;
// global attributes:
:_NCProperties = "version=2,netcdf=4.8.1,hdf5=1.12.2" ;
}`
In the paper, how were the last 50 images used to test the model?
My understanding is that the network is trained using 5266 images, then the cloud masks of the first 50 images are applied to the last 50 images, and the trained model is validated on these 50 images.
Is the testing process like this?
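The procedure described in this question can be sketched as follows: the pattern of missing pixels in the first N images is copied onto the last N images, the original values hidden this way are set aside, and the reconstruction can later be compared against them. This is an illustration in plain Python with nested lists, not the authors' code; `apply_masks_from_first` and `images` are hypothetical names.

```python
import math

def apply_masks_from_first(images, n):
    """Hide pixels in the last `n` images wherever the corresponding
    image among the first `n` is missing (NaN).

    Returns (masked_images, withheld), where `withheld` maps
    (time, row, col) to the original value that was hidden.
    """
    masked = [[row[:] for row in img] for img in images]  # copy, input stays intact
    withheld = {}
    first_of_last = len(images) - n
    for k in range(n):
        source = images[k]          # image that provides the cloud mask
        t = first_of_last + k       # image that receives the mask
        for j, row in enumerate(source):
            for i, value in enumerate(row):
                if math.isnan(value) and not math.isnan(images[t][j][i]):
                    withheld[(t, j, i)] = images[t][j][i]
                    masked[t][j][i] = math.nan
    return masked, withheld

nan = math.nan
images = [
    [[nan, 1.0], [2.0, 3.0]],    # first image: cloud at (0, 0)
    [[4.0, 5.0], [6.0, 7.0]],    # image in between
    [[8.0, 9.0], [10.0, 11.0]],  # last image: fully observed
]
masked, withheld = apply_masks_from_first(images, n=1)
print(withheld)  # {(2, 0, 0): 8.0}
```

The `withheld` dictionary plays the role of the independent validation values: an error statistic would be computed only at those positions.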
Hello, I am a beginner. How should I prepare the input data set for DINCAE? I have some global SST nc data sets. Should I merge them into one nc file and then create the data set required by DINCAE from it? I'm confused about this by create_input_file.py in the examples.
This is not a bug report but rather a beginner's question.
I don't understand what to do next after create_input_file.jl, or how it works with multiple MODIS files.
Is this related to the append mode in NCDatasets.jl, used to aggregate the SST variables?
Thank you
Before I describe my problem, I would like to thank the author in advance for a kind response.
I keep failing to apply the trained DINCAE model to a test dataset. When I run the pre-trained model on the test dataset, Python raises an error that I cannot identify. How can I apply the pre-trained model?
Has anyone managed to do this?
import os
from datetime import datetime
from math import ceil
import tensorflow as tf

model_call_path = "/share/ocean/SSS/SMAP_SSS_JPL_L2/G_2_stack_map_EA_3X3window/result/window5_RF_only/"
os.chdir(model_call_path)
sess = tf.compat.v1.Session()
# load the saved graph definition and restore the trained weights into it
saver = tf.train.import_meta_graph('./best_model.ckpt.meta')
saver.restore(sess, tf.train.latest_checkpoint(model_call_path))

# test dataset without added clouds
# must be reinitializable
test_dataset = tf.data.Dataset.from_generator(
    test_datagen, (tf.float32, tf.float32),
    (tf.TensorShape([jmax, imax, nvar]), tf.TensorShape([jmax, imax, 2]))).batch(batch_size)

if nprefetch > 0:
    # train_dataset = train_dataset.prefetch(nprefetch)
    test_dataset = test_dataset.prefetch(nprefetch)

test_iterator = tf.compat.v1.data.Iterator.from_structure(test_dataset.output_types,
                                                          test_dataset.output_shapes)
test_iterator_init_op = test_iterator.make_initializer(test_dataset)
test_iterator_handle = sess.run(test_iterator.string_handle())

handle = tf.compat.v1.placeholder(tf.string, shape=[], name="handle_name_iterator")
iterator = tf.compat.v1.data.Iterator.from_string_handle(
    handle, test_iterator.output_types, output_shapes=test_iterator.output_shapes)
inputs_, xtrue = iterator.get_next()

# (the code that builds the network structure is omitted here)

timestr = datetime.now().strftime("%Y-%m-%dT%H%M%S")
fname = os.path.join(outdir, "data-{}.nc".format(timestr))

# reset test iterator, so that we start from the beginning
sess.run(test_iterator_init_op)

for ii in range(ceil(test_len / batch_size)):
    summary, batch_cost, batch_RMS, batch_m_rec, batch_σ2_rec = sess.run(
        [merged, cost, RMS, m_rec, σ2_rec],
        feed_dict={handle: test_iterator_handle,
                   mask_issea: mask})

    # time instances already written
    offset = ii * batch_size
    savesample(fname, batch_m_rec, batch_σ2_rec, meandata, lon, lat, e, ii,
               offset, transfun=transfun)
Error message
FailedPreconditionError: 2 root error(s) found.
(0) Failed precondition: Attempting to use uninitialized value conv2d_3/kernel_1
[[node conv2d_3/kernel_1/read (defined at :506) ]]
[[Sqrt_2/_131]]
(1) Failed precondition: Attempting to use uninitialized value conv2d_3/kernel_1
[[node conv2d_3/kernel_1/read (defined at :506) ]]
0 successful operations.
0 derived errors ignored.
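The name in this error, `conv2d_3/kernel_1`, hints at a possible cause. TensorFlow appends a suffix such as `_1` when a name already exists in the graph, which would happen if `import_meta_graph` loads the saved graph and the network-construction code then builds a second copy of the same layers. The restored weights would belong to the first copy, while the tensors passed to `sess.run` belong to the second, uninitialized one. This is an inference from the message, not a verified diagnosis. A small check, using the hypothetical helper `duplicated_variable_names`, can show whether such duplicates exist:

```python
import re

def duplicated_variable_names(names):
    """Return names that look like a uniquified copy of another name,
    e.g. 'conv2d_3/kernel_1' when 'conv2d_3/kernel' is also present."""
    present = set(names)
    duplicates = []
    for name in names:
        match = re.fullmatch(r"(.+)_(\d+)", name)
        if match and match.group(1) in present:
            duplicates.append(name)
    return duplicates

# In a TensorFlow 1.x style session the names could be collected with
#   names = [v.op.name for v in tf.compat.v1.global_variables()]
# Here a hand-written list stands in for that call.
names = ["conv2d_3/kernel", "conv2d_3/bias",
         "conv2d_3/kernel_1", "conv2d_3/bias_1"]
print(duplicated_variable_names(names))
# ['conv2d_3/kernel_1', 'conv2d_3/bias_1']
```

If duplicates do show up, two common ways out are to build the network first and restore with a `tf.compat.v1.train.Saver()` created after construction (without `import_meta_graph`), or to skip the construction code and fetch the tensors from the imported graph by name.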
I really want to apply it to a separate cross-validation dataset.
From the paper:
"sess" is created before the code shown above, and the trained model is run with that session, as I mentioned at the start.
Describe the bug
Error when running the original examples
To Reproduce
Please provide a minimal code example which reproduces the behavior (bug, performance regression, ...).
Environment
Full output
In case of an error, please paste the full error message and stack trace.
`
2023-02-04 11:40:35.989971: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-04 11:40:37.278685: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-04 11:40:37.279681: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2023-02-04 11:40:37.280860: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance
varname SST (3, 3)
data shape: (3, 3, 3)
data range: 0.0 0.0
sz (3, 3, 3)
sz (3, 3, 3)
Number of input variables: 10
regularization_L2_beta 0
enc_nfilter_internal [16, 24, 36, 54]
nvar 10
nepoch_keep_missing 0
Traceback (most recent call last):
  File "/home/ocean/DINCAE-master/run_DINCAE.py", line 10, in <module>
    DINCAE.reconstruct_gridded_nc(filename,varname,outdir)
  File "/home/ocean/DINCAE-master/DINCAE/__init__.py", line 733, in reconstruct_gridded_nc
    reconstruct(
  File "/home/ocean/DINCAE-master/DINCAE/__init__.py", line 404, in reconstruct
    train_iterator_handle = sess.run(train_iterator.string_handle())
AttributeError: 'OwnedIterator' object has no attribute 'string_handle'
`
Input file
Run the shell command ncdump -h myfile.nc
and paste the output here.
`
ocean@ocean-HP-EliteDesk-800-G1-TWR:~/DINCAE-master/examples$ ncdump -h input_file_python.nc
netcdf input_file_python {
dimensions:
lon = 3 ;
lat = 3 ;
time = UNLIMITED ; // (3 currently)
variables:
float lon(lon) ;
float lat(lat) ;
float time(time) ;
time:units = "days since 1900-01-01 00:00:00" ;
float SST(time, lat, lon) ;
SST:_FillValue = -9999.f ;
int mask(lat, lon) ;
mask:comment = "one means the data location is valid (e.g. sea for SST), zero the location is invalid (e.g. land for SST)" ;
// global attributes:
:_NCProperties = "version=2,netcdf=4.8.1,hdf5=1.10.6" ;
}
`
Describe the bug
Trying to run DINCAE on CPR observations.
Stacktrace:
[1] throw_boundserror(A::NCDatasets.CFVariable{Union{Missing, Float64}, 1, NCDatasets.Variable{Float64, 1, NCDataset{Nothing}}, NCDatasets.Attributes{NCDataset{Nothing}}, NamedTuple{(:fillvalue, :missing_values, :scale_factor, :add_offset, :calendar, :time_origin, :time_factor), Tuple{Float64, Tuple{Float64}, Nothing, Nothing, Nothing, Nothing, Nothing}}}, I::Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}})
@ Base ./abstractarray.jl:703
[2] checkbounds
@ ./abstractarray.jl:668 [inlined]
[3] _getindex
@ ./multidimensional.jl:874 [inlined]
[4] getindex(A::NCDatasets.CFVariable{Union{Missing, Float64}, 1, NCDatasets.Variable{Float64, 1, NCDataset{Nothing}}, NCDatasets.Attributes{NCDataset{Nothing}}, NamedTuple{(:fillvalue, :missing_values, :scale_factor, :add_offset, :calendar, :time_origin, :time_factor), Tuple{Float64, Tuple{Float64}, Nothing, Nothing, Nothing, Nothing, Nothing}}}, I::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64})
@ Base ./abstractarray.jl:1241
[5] loadragged(ncvar::NCDatasets.CFVariable{Union{Missing, Float64}, 1, NCDatasets.Variable{Float64, 1, NCDataset{Nothing}}, NCDatasets.Attributes{NCDataset{Nothing}}, NamedTuple{(:fillvalue, :missing_values, :scale_factor, :add_offset, :calendar, :time_origin, :time_factor), Tuple{Float64, Tuple{Float64}, Nothing, Nothing, Nothing, Nothing, Nothing}}}, index::Colon)
@ NCDatasets ~/.julia/packages/NCDatasets/gTGnf/src/variable.jl:152
[6] (::DINCAE.var"#131#132"{String})(ds::NCDataset{Nothing})
@ DINCAE ~/.julia/packages/DINCAE/OlSY0/src/points.jl:442
[7] NCDataset(f::DINCAE.var"#131#132"{String}, args::String; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ NCDatasets ~/.julia/packages/NCDatasets/gTGnf/src/dataset.jl:241
[8] NCDataset
@ ~/.julia/packages/NCDatasets/gTGnf/src/dataset.jl:238 [inlined]
[9] loaddata
@ ~/.julia/packages/DINCAE/OlSY0/src/points.jl:441 [inlined]
[10] reconstruct_points(T::Type, Atype::Type, filename::String, varname::String, grid::Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, fnames_rec::Vector{String}; epochs::Int64, batch_size::Int64, truth_uncertain::Bool, enc_nfilter_internal::StepRange{Int64, Int64}, skipconnections::UnitRange{Int64}, clip_grad::Float64, regularization_L1_beta::Int64, regularization_L2_beta::Float64, save_epochs::StepRange{Int64, Int64}, upsampling_method::Symbol, probability_skip_for_training::Float64, jitter_std_pos::Tuple{Float32, Float32}, ntime_win::Int64, learning_rate::Float64, learning_rate_decay_epoch::Int64, min_std_err::Float64, loss_weights_refine::Tuple{Float64}, auxdata_files::Vector{NamedTuple{(:filename, :varname, :errvarname), Tuple{String, String, String}}}, savesnapshot::Bool)
@ DINCAE ~/.julia/packages/DINCAE/OlSY0/src/points.jl:545
[11] top-level scope
So the problem occurs at the reading step, in the function DINCAE.loaddata(filename,varname): https://github.com/gher-uliege/DINCAE.jl/blob/main/src/points.jl#L440-L457, which uses loadragged.
Why does the input have to be written in the contiguous ragged array representation?
To Reproduce
Please provide a minimal code example which reproduces the behavior (bug, performance regression, ...).
Environment
Input file
netcdf CPRdata {
dimensions:
obs = 250021 ;
depth = 1 ;
variables:
double time(obs) ;
time:_CoordinateAxisType = "Time" ;
string time:actual_range = "1958-01-01", "2020-12-23" ;
time:axis = "T" ;
time:calendar = "Gregorian" ;
time:ioos_category = "Time" ;
time:long_name = "Valid Time GMT" ;
time:standard_name = "time" ;
time:time_origin = "01-JAN-1900 00:00:00" ;
time:units = "days since 1900-01-01T00:00:00Z" ;
double latitude(obs) ;
latitude:_CoordinateAxisType = "Lat" ;
latitude:_FillValue = -999. ;
latitude:actual_range = 28.045f, 90.f ;
latitude:axis = "Y" ;
latitude:ioos_category = "Location" ;
latitude:latitude_reference_datum = "geographical coordinates, WGS84 projection" ;
latitude:long_name = "Latitude" ;
latitude:missing_value = -999. ;
latitude:standard_name = "latitude" ;
latitude:units = "degrees_north" ;
latitude:valid_max = 90. ;
latitude:valid_min = -90. ;
double longitude(obs) ;
longitude:_CoordinateAxisType = "Lon" ;
longitude:_FillValue = -999. ;
longitude:actual_range = -15.4244, 180.002 ;
longitude:axis = "X" ;
longitude:ioos_category = "Location" ;
longitude:long_name = "Longitude" ;
longitude:longitude_reference_datum = "geographical coordinates, WGS84 projection" ;
longitude:missing_value = -999. ;
longitude:standard_name = "longitude" ;
longitude:units = "degrees_east" ;
longitude:valid_max = 180. ;
longitude:valid_min = -180. ;
double Calanus_Finmarchicus(obs) ;
Calanus_Finmarchicus:_FillValue = -999. ;
Calanus_Finmarchicus:actual_range = 0., 5012. ;
Calanus_Finmarchicus:coordinates = "time" ;
Calanus_Finmarchicus:long_name = "Abunance of Calanus Finmarchicus" ;
Calanus_Finmarchicus:sdn_parameter_urn = "SDN:P01::Z302M01Z" ;
Calanus_Finmarchicus:sdn_parameter_name = "Abundance of Calanus finmarchicus (ITIS: 85272: WoRMS 104464) per unit volume of the water body by optical microscopy" ;
Calanus_Finmarchicus:sdn_uom_urn = "SDN:P06::UCPL" ;
Calanus_Finmarchicus:sdn_uom_name = "Individuals per cubic meter" ;
Calanus_Finmarchicus:AphiaID = "104464" ;
Calanus_Finmarchicus:missing_value = -999. ;
Calanus_Finmarchicus:units = "Individuals per cubic meter" ;
Calanus_Finmarchicus:sample_dimension = "obs" ;
...
}
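For context on the question about the contiguous ragged array representation: in that CF layout, all observations are stored in one flat array ordered by feature (here, presumably by time instance), and a separate count variable records how many observations belong to each feature. The sketch below shows that layout in plain Python. The function names and the grouping by integer day are assumptions for illustration, and the code does not reproduce what `loadragged` does internally.

```python
from itertools import groupby

def to_contiguous_ragged(times, values):
    """Group point observations by time instance.

    Returns (unique_times, counts, flat_values): observations sharing
    a time are stored next to each other in `flat_values`, and
    `counts[k]` tells how many of them belong to `unique_times[k]`.
    """
    order = sorted(range(len(times)), key=lambda n: times[n])
    unique_times, counts, flat_values = [], [], []
    for t, group in groupby(order, key=lambda n: times[n]):
        members = list(group)
        unique_times.append(t)
        counts.append(len(members))
        flat_values.extend(values[n] for n in members)
    return unique_times, counts, flat_values

def slice_instance(counts, flat_values, k):
    """Recover the observations of the k-th time instance."""
    start = sum(counts[:k])
    return flat_values[start:start + counts[k]]

# Hypothetical abundances observed on three days (days since some origin).
times = [2, 1, 2, 3, 1, 2]
values = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
unique_times, counts, flat = to_contiguous_ragged(times, values)
print(unique_times)                      # [1, 2, 3]
print(counts)                            # [2, 3, 1]
print(slice_instance(counts, flat, 1))   # [10.0, 12.0, 15.0]
```

In the CF convention it is the count variable that carries the `sample_dimension` attribute. In the file above that attribute sits on the data variable `Calanus_Finmarchicus` and no count variable is listed, which may be related to the indexing error in the stack trace, though that is a guess.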