s-b-iqbal / predicting-power-output-of-a-combined-cycle-power-plant

The objective of the Project is to predict ‘Full Load Electrical Power Output’ of a Base load operated combined cycle power plant using Polynomial Multiple Regression. Concepts : 1) Clustering, 2) Polynomial Regression, 3) LASSO, 4) Cross-Validation, 5) Bootstrapping

Jupyter Notebook 100.00%
k-means-clustering silhouette polynomial-regression lasso cross-validation bootstrapping-statistics

Predicting Full Load Electrical Output of a Combined Cycle Power Plant using Polynomial Linear Regression.

Objective :

The objective of the project is to predict the ‘Full Load Electrical Power Output’ of a base-load-operated combined cycle power plant using Polynomial Multiple Regression. I use Clustering to explore the relationships between the variables at play, Cross-Validation to select the hyper-parameters of the model, and ‘LASSO’ for dimension reduction. Finally, I apply Bootstrapping to assess the accuracy of the model on the Test dataset.

Motivation:

Predicting the full load electrical power output of a base load power plant is important in order to maximize the profit from the available megawatt hours. The base load operation of a power plant is influenced by four main parameters, which are used as the input parameters in the dataset: Ambient Temperature, Atmospheric Pressure, Relative Humidity and Exhaust Steam Pressure.

Data Collection:

URL : [https://archive.ics.uci.edu/ml/datasets/combined+cycle+power+plant]

The dataset contains 9568 data points collected from a Combined Cycle Power Plant over 6 years (2006-2011), when the power plant was set to work with full load. Features consist of the hourly average ambient variables Temperature (T), Ambient Pressure (AP), Relative Humidity (RH) and Exhaust Vacuum (V), which are used to predict the net hourly electrical energy output (EP) of the plant.

A combined cycle power plant (CCPP) is composed of gas turbines (GT), steam turbines (ST) and heat recovery steam generators. In a CCPP, the electricity is generated by gas and steam turbines, which are combined in one cycle, and is transferred from one turbine to another. While the Vacuum is collected from and has an effect on the Steam Turbine, the other three ambient variables affect the GT performance.

For comparability with the baseline studies, and to allow 5x2 fold statistical tests to be carried out, the data is provided shuffled five times. For each shuffling, 2-fold CV is carried out and the resulting 10 measurements are used for statistical testing. The data is provided in both .ods and .xlsx formats.

Relevant Papers to cite:

- Pınar Tüfekci, Prediction of full load electrical power output of a base load operated combined cycle power plant using machine learning methods, International Journal of Electrical Power & Energy Systems, Volume 60, September 2014, Pages 126-140, ISSN 0142-0615, http://dx.doi.org/10.1016/j.ijepes.2014.02.027. (http://www.sciencedirect.com/science/article/pii/S0142061514000908)
- Heysem Kaya, Pınar Tüfekci, Sadık Fikret Gürgen: Local and Global Learning Methods for Predicting Power of a Combined Gas & Steam Turbine, Proceedings of the International Conference on Emerging Trends in Computer and Electronics Engineering ICETCEE 2012, pp. 13-18 (Mar. 2012, Dubai)
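The 5x2 fold scheme mentioned in the dataset description (five shuffles, 2-fold CV on each, ten measurements in total) can be approximated with scikit-learn's `RepeatedKFold`. This is only a minimal sketch: the data below is a synthetic stand-in for the four features and the target, and the plain linear model is a placeholder, not the model used in this project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score

# ASSUMPTION: synthetic stand-in for the features (T, V, AP, RH) and target (EP).
# Replace X and y with the real dataset columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = X @ np.array([-2.0, -1.0, 0.5, 0.3]) + rng.normal(scale=0.5, size=400)

# 5 shuffles x 2 folds = 10 train/test evaluations.
cv = RepeatedKFold(n_splits=2, n_repeats=5, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

print(f"{len(scores)} R^2 measurements, mean = {scores.mean():.3f}")
```

The ten scores can then be fed into a paired statistical test when two models are compared on the same splits.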

Workflow :

1. **Exploratory Data Analysis**
2. Clustering :
    - K-Means Clustering.
        - Identification of optimal K.
        - Silhouette Analysis
3. Polynomial Multiple Regression:
    - a. Data Modelling.
    - b. Division of dataset into: Train, Test and Validation set.
    - c. Cross-validation to find the optimum degree ‘n’ for the polynomial regression.
    - d. LASSO for dimension reduction:
        - Cross-Validation on Training and Validation set to find the best ‘alpha’ for LASSO reduction.
    - e. Model Evaluation on the Test data using the metrics R^2 and adjusted R^2.
4. Bootstrapping : Confidence Interval of R^2 for Test Data.
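Step 2 of the workflow (choosing K for K-Means through silhouette analysis) might look roughly like the sketch below. It is not the notebook's code: the data is a synthetic set of three blobs, and the candidate range of K and the standardization step are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic data with three well-separated groups in 5 dimensions,
# standing in for the four features plus the power output.
rng = np.random.default_rng(0)
centers = np.array(
    [[0, 0, 0, 0, 0], [6, 6, 6, 6, 6], [-6, 6, -6, 6, -6]], dtype=float
)
data = np.vstack([c + rng.normal(size=(100, 5)) for c in centers])

# Standardize so that no single variable dominates the distance computation.
scaled = StandardScaler().fit_transform(data)

# Fit K-Means for each candidate K and record the mean silhouette score.
silhouette_by_k = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
    silhouette_by_k[k] = silhouette_score(scaled, labels)

# The K with the highest mean silhouette score is taken as the best candidate.
best_k = max(silhouette_by_k, key=silhouette_by_k.get)
print(best_k, silhouette_by_k)
```

On real data the silhouette curve is usually less clear-cut, so it is worth inspecting the per-cluster silhouette plots as well as the mean score.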

Results:

1. EDA gives a preliminary glimpse of how the various factors affect the Power output.
2. Clustering showcases which factors are responsible for a higher Power output:
    - It suggests that, to increase Power, Humidity and Pressure should also be increased.
    - It shows that, for high levels of power to be generated, the Plant Temperature and Vacuum levels should be as low as possible.
3. 10-fold Cross-Validation helps in finding the optimal degree for the Polynomial Regression and the level of 'alpha' in the LASSO model.
4. LASSO shows which parameters are important in the final model.
5. Bootstrapping gives the Confidence Interval of the model's accuracy on unseen data.

Conclusion:

By tuning the hyper-parameters using cross-validation and applying LASSO to retain the most important dimensions, the model is able to achieve an accuracy of 93% on the Test Data. Thus, we can use this model to predict with high accuracy the Power output of a Combined Cycle Power Plant. This can substantially bring down the cost of production by controlling the input parameters of the plant and lead to increased efficiency.
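A test-set score such as the one quoted above is a single point estimate. The bootstrapping step of the workflow puts a confidence interval around it, and a minimal sketch of that idea follows. The helper names, the percentile method, the number of resamples and the synthetic predictions are all assumptions for illustration, not the notebook's actual code or results.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(r2_value, n_samples, n_features):
    """R^2 penalized for the number of predictors in the model."""
    return 1.0 - (1.0 - r2_value) * (n_samples - 1) / (n_samples - n_features - 1)

def bootstrap_r2_ci(y_true, y_pred, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for R^2 on a fixed set of predictions."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    scores = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample test rows with replacement
        scores[b] = r2(y_true[idx], y_pred[idx])
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(scores, [tail, 1.0 - tail])
    return lower, upper

# ASSUMPTION: arbitrary synthetic targets and predictions, not project results.
rng = np.random.default_rng(1)
y_test = rng.normal(loc=450.0, scale=17.0, size=500)
y_pred = y_test + rng.normal(scale=4.5, size=500)

point = r2(y_test, y_pred)
lower, upper = bootstrap_r2_ci(y_test, y_pred)
print(f"R^2 = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
print(f"adjusted R^2 (4 features) = {adjusted_r2(point, len(y_test), 4):.3f}")
```

Resampling the test rows while keeping the fitted model fixed measures how much the score depends on which test points happened to be drawn. It does not capture the variability from refitting the model.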
