Automated classification of female uroflowmetry curve patterns.

Sorel M1, Baas S2, Rosier P F W M1, Bosch R1, Brand E2, van der Kamp M2, Boele D2, Huizinga E2, Geurts B2, de Kort L M O1

Research Type

Pure and Applied Science / Translational

Abstract Category

Female Lower Urinary Tract Symptoms (LUTS) / Voiding Dysfunction

Abstract 338
Open Discussion ePosters
Scientific Open Discussion Session 21
Thursday 30th August 2018
13:45 - 13:50 (ePoster Station 2)
Exhibition Hall
Female, Mathematical or statistical modelling, Urodynamics Techniques
1. University Medical Center Utrecht, The Netherlands, 2. University of Twente, The Netherlands
Presenter

Laetitia M O de Kort

Abstract

Hypothesis / aims of study
Uroflowmetry is a widely used non-invasive diagnostic test in patients with lower urinary tract dysfunction. In women, the pattern of the uroflowmetry curve (UFC) is one of the uroflowmetry outcomes. However, there is no structured and objective method for classifying the uroflowmetry curve pattern.
Computerized analysis of UFC patterns and automated UFC pattern classification may lead to a more standardized and objective analysis of UFCs. Since a more objective assessment of UFCs might contribute to a correct diagnosis of female voiding dysfunction, we aimed to develop an automated system for the classification of UFC patterns.
Study design, materials and methods
The automated classification was developed in stages and was based on three different datasets. The first dataset consisted of UFCs of healthy young women; the second consisted of UFCs of free flows recorded prior to invasive urodynamic study in female patients older than 18 years with lower urinary tract symptoms. The third dataset was an extension of the second. The sets consisted of increasing numbers of UFCs and increasing proportions of abnormal flow rates. UFCs were allocated by experts in functional urology to one of four categories: normal (A), staccato (B), interrupted (C) or long flow (D). The certainty of allocation was recorded. The UFCs scored with the highest certainty were identified as reference curves.
The datasets were tested in three systems: first, a questionnaire algorithm; second, an optimized parameter algorithm; and finally, a machine learning system. The questionnaire algorithm was based on qualifying or quantifying criteria taken from the literature regarding UFC patterns, in combination with the expert allocation. For the optimized parameter algorithm, criteria from the initial questionnaire algorithm were added or removed and thresholds were adapted in order to increase agreement.
As UFCs may be a combination of basic flow patterns, for the machine learning algorithm UFCs were not allocated to a single pattern category; instead, a likelihood assignment was performed. A group of experts assessed to what extent UFCs matched each of the four basic UFC patterns. This resulted in a third classified dataset of UFCs, each with a specific likelihood assignment. This dataset was used to develop a system to generate 'machine-likelihoods' that were able to reproduce the experts' assignments. We constructed a set of classifiers that was optimized to give the best possible correspondence between the reference curves and their respective classes. Different classifiers were constructed and their performance was compared.
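To make the idea of a questionnaire-style algorithm concrete, the following is a minimal Python sketch of a rule-based classifier that asks a fixed sequence of yes/no questions about a sampled flow curve. It is an illustration only: the function name, the sampling interval, the order of the questions and both thresholds (`long_flow_seconds`, `dip_fraction`) are hypothetical placeholders and are not the criteria used in the study.

```python
from typing import List


def classify_flow_pattern(flow: List[float], dt: float = 0.5,
                          long_flow_seconds: float = 60.0,
                          dip_fraction: float = 0.3) -> str:
    """Assign a flow curve (samples in ml/s) to pattern A, B, C or D.

    All thresholds are hypothetical placeholders, not the study's criteria.
    """
    if not flow or max(flow) <= 0:
        raise ValueError("flow curve must contain positive samples")
    qmax = max(flow)

    # Trim leading and trailing zero-flow samples to get the voiding window.
    start = next(i for i, q in enumerate(flow) if q > 0)
    end = len(flow) - next(i for i, q in enumerate(reversed(flow)) if q > 0)
    window = flow[start:end]

    # Question 1 (interrupted, C): does flow stop inside the voiding window?
    if any(q <= 0 for q in window):
        return "C"

    # Question 2 (staccato, B): is there a marked dip that does not reach zero?
    # A dip is a local minimum lying more than dip_fraction * Qmax below the
    # highest sample before it and the highest sample after it.
    for i in range(1, len(window) - 1):
        if window[i] <= window[i - 1] and window[i] < window[i + 1]:
            depth = min(max(window[:i]), max(window[i + 1:])) - window[i]
            if depth > dip_fraction * qmax:
                return "B"

    # Question 3 (long flow, D): does voiding last longer than the threshold?
    if len(window) * dt > long_flow_seconds:
        return "D"

    # Otherwise the curve is treated as normal (A).
    return "A"


print(classify_flow_pattern([0, 5, 15, 25, 20, 10, 3, 0]))
print(classify_flow_pattern([0, 10, 20, 8, 22, 9, 18, 4, 0]))
```

In this sketch, "optimizing" the algorithm would amount to adding or removing questions and tuning the two thresholds until agreement with the expert labels stops improving. Real recordings would also need smoothing before local minima are counted.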
The sensitivity score (S-score) and the area under the receiver operating characteristic curve (ROC AUC) were computed. The questionnaire algorithm and the optimized algorithm were also applied to the third (largest) dataset. The ability of each algorithm to reliably identify the correct UFC pattern was tested.
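The abstract does not define how the S-score or the AUC were calculated. As a hedged illustration, the Python sketch below assumes that the S-score is the fraction of reference curves assigned to the expert's pattern, and that the AUC is computed one pattern against the rest using the rank-based (Mann-Whitney) formulation. The function names and the toy data are hypothetical.

```python
from typing import Sequence


def sensitivity_score(expert: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of curves whose predicted pattern equals the expert label.

    Assumed reading of the S-score; the abstract gives no formula.
    """
    if len(expert) != len(predicted) or not expert:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(e == p for e, p in zip(expert, predicted))
    return correct / len(expert)


def one_vs_rest_auc(expert: Sequence[str], scores: Sequence[float],
                    positive: str) -> float:
    """ROC AUC for one pattern against all others.

    Equals the probability that a randomly chosen curve of the positive
    pattern receives a higher score than a randomly chosen curve of any
    other pattern, with ties counted as one half.
    """
    pos = [s for e, s in zip(expert, scores) if e == positive]
    neg = [s for e, s in zip(expert, scores) if e != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative curve")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: five curves, expert labels and scores for pattern A.
expert_labels = ["A", "A", "B", "C", "A"]
predicted_labels = ["A", "A", "B", "A", "A"]
scores_for_a = [0.9, 0.8, 0.2, 0.7, 0.6]

print(sensitivity_score(expert_labels, predicted_labels))
print(one_vs_rest_auc(expert_labels, scores_for_a, positive="A"))
```

Under this assumed definition, one misclassified curve out of 48 gives 47/48, which rounds to 0.98.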
Results
Questionnaire algorithm
From the first set of UFCs, 159 remained after exclusion of technically improper curves, and the experts classified 90 of these. Forty-eight of the 90 UFCs were classified as reference curves: 44 pattern A, 1 pattern B, 2 pattern C and 1 pattern D. The inter-observer agreement was fair (kappa 0.20 and 0.40).
Testing the final optimal questionnaire algorithm on this set of reference curves resulted in an S-score of 0.98, with only one UFC classified incorrectly.
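The abstract reports kappa values for inter-observer agreement without naming the statistic used. Assuming pairwise Cohen's kappa between two raters, the calculation can be sketched in Python as follows. The function name and the example ratings are hypothetical and do not come from the study data.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_1: Sequence[str], rater_2: Sequence[str]) -> float:
    """Cohen's kappa: agreement between two raters corrected for chance."""
    if len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_1)

    # Observed agreement: share of curves given the same pattern by both.
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n

    # Chance agreement: product of the raters' marginal label frequencies,
    # summed over all patterns either rater used.
    counts_1 = Counter(rater_1)
    counts_2 = Counter(rater_2)
    labels = counts_1.keys() | counts_2.keys()
    expected = sum(counts_1[k] * counts_2[k] for k in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so perfect agreement is reported by convention in this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of eight curves by two experts.
expert_1 = ["A", "A", "A", "B", "C", "A", "B", "A"]
expert_2 = ["A", "A", "B", "B", "C", "A", "A", "A"]
print(cohens_kappa(expert_1, expert_2))
```

In this toy example the raters agree on six of eight curves (0.75), but because pattern A dominates, chance agreement is already about 0.47, so kappa is considerably lower than the raw agreement.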

Optimized algorithm
The experts classified 365 UFCs of the second set. The inter-observer agreement was moderate (kappa 0.54, 0.51 and 0.70). Of these UFCs, 138 were reference curves: 87 pattern A, 19 pattern B, 22 pattern C and 10 pattern D.
The final optimized questionnaire resulted in an S-score of 0.94 and an AUC of 0.9892.

Machine system
With the machine system, a total of 708 UFCs with likelihood assignments given by experts were used; 428 UFCs were labeled as reference curves, of which 71% belonged to pattern A, 12% to pattern B, 14% to pattern C and 3% to pattern D. The machine system resulted in an S-score of 0.98 and an AUC of almost 1.00 for all four UFC types.

The regression forest classifier resulted in almost perfect agreement between the machine classification and the database of reference curves, and the machine system with this classifier generated the best results. The machine system classified nine UFCs differently from the experts; there were discrepancies between patterns A and C (five UFCs) or between A and B (two UFCs).
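The comparison between machine output and expert classification can be illustrated with a short Python sketch. It assumes, hypothetically, that each curve's machine output is a likelihood per pattern, that the machine's class is the pattern with the highest likelihood, and that discrepancies are tallied by the pair of patterns involved. The function names and data are illustrative and are not taken from the study.

```python
from collections import Counter
from typing import Dict, Sequence


def dominant_pattern(likelihoods: Dict[str, float]) -> str:
    """Return the pattern with the highest machine-likelihood."""
    return max(likelihoods, key=likelihoods.get)


def discrepancy_pairs(expert_labels: Sequence[str],
                      machine_likelihoods: Sequence[Dict[str, float]]) -> Counter:
    """Count disagreements, keyed by the unordered pair of patterns involved."""
    pairs: Counter = Counter()
    for expert, likelihoods in zip(expert_labels, machine_likelihoods):
        machine = dominant_pattern(likelihoods)
        if machine != expert:
            pairs[tuple(sorted((expert, machine)))] += 1
    return pairs


# Hypothetical expert labels and machine-likelihoods for four curves.
experts = ["A", "C", "B", "A"]
machine = [
    {"A": 0.8, "B": 0.1, "C": 0.1, "D": 0.0},
    {"A": 0.6, "B": 0.1, "C": 0.3, "D": 0.0},  # expert C, machine A
    {"A": 0.2, "B": 0.7, "C": 0.1, "D": 0.0},
    {"A": 0.4, "B": 0.5, "C": 0.1, "D": 0.0},  # expert A, machine B
]
print(discrepancy_pairs(experts, machine))
```

A tally of this kind makes it easy to see which pattern boundaries account for most disagreements.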
Interpretation of results
This study reports the development of a method to digitally analyze uroflowmetry curves of women as a first stage of standardized and objective assessment of UFC patterns. The S-scores of the consecutive algorithms increased with the successive steps.
Across the different datasets, with different numbers of UFCs and differences in heterogeneity of flow patterns, the questionnaire algorithm, the optimized algorithm and the machine system approach all yielded near-optimal results in classifying the UFCs. However, the machine system with the regression forest classifier performed adequately on the largest and most heterogeneous database. With the machine system, not only were the reference curves evaluated, for which the classification is the most apparent, but also the correspondence between the classifier outputs and the 'true' likelihoods, thus providing the opportunity to refine the assessment of additional features of the different UFC patterns. Since a regression forest classifier has little dependence on heuristic input, we speculate that this method has the highest potential for future clinical usage.
Concluding message
We report the initial development of an automated system for the classification of female uroflowmetry curves into one of four patterns. Early designs of the system learned to classify selected UFCs that showed perfect expert agreement (reference curves). After three rounds of improvement, a machine learning system (regression forest classifier) resulted in nearly perfect classification not only of the reference curves but also of a larger, more random set.
Disclosures
Funding: None. Clinical Trial: No. Subjects: Human. Ethics Committee: Ethical Committee of the University Medical Center Utrecht. Helsinki: Yes. Informed Consent: No.