
distributed-matrix-completion's People

Contributors

kmu-leeky

distributed-matrix-completion's Issues

Implement outer-product matrix multiplication in Spark

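Source: https://www.thinkbiganalytics.com/2015/11/23/scalable-matrix-multiplication-using-spark-2/
When n becomes very large, the multiplication runs out of memory.
Spark has several distributed matrix types, but multiply takes only a local matrix as its argument (this is the case for RowMatrix and IndexedRowMatrix). In other words, with MLlib a large matrix cannot be multiplied by another large matrix this way, because the local operand does not fit in memory.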

To solve this problem, we use the outer product.
[Figure: outer product concept]
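
For reference, the identity behind this approach can be written as follows. It assumes A is m×n with columns a_i and B is n×k with rows b_i^T; the symbols a_i and b_i are notation introduced here, not taken from the source article.

```latex
AB \;=\; \sum_{i=1}^{n} a_i \, b_i^{\top},
\qquad a_i \in \mathbb{R}^{m}, \quad b_i^{\top} \in \mathbb{R}^{1 \times k}
```

Each term a_i b_i^T is an m×k matrix that depends on only one column of A and one row of B, so the n terms can be computed independently and then summed.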

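As shown below, n RDDs are built from the column vectors of A and the row vectors of B (A is m×n and B is n×k).
Each RDD holds m+k elements: one column of A (m values) and the matching row of B (k values).
[Figure: RDD layout built from the columns of A and the rows of B]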
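
A minimal sketch of this layout in plain Python, with an ordinary list standing in for the distributed collection. It assumes A and B are stored as lists of rows; `build_records` is a hypothetical helper name used for illustration and is not taken from the linked source code.

```python
def build_records(A, B):
    """Pair column i of A with row i of B, for i = 0 .. n-1."""
    m, n = len(A), len(A[0])
    if len(B) != n:
        raise ValueError("A must have as many columns as B has rows")
    records = []
    for i in range(n):
        col = [A[r][i] for r in range(m)]   # column i of A: m elements
        row = list(B[i])                    # row i of B: k elements
        records.append((i, col + row))      # m + k elements per record
    return records


# Hypothetical input: A is 2x3 (m=2, n=3), B is 3x2 (n=3, k=2).
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]
records = build_records(A, B)
```

Here `records` has n = 3 entries, and each entry carries m + k = 4 values, matching the element count stated above.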

Example: multiplying the following matrices with the outer product,
[Figure: example matrices]

the computation proceeds as follows.
[Figure: step-by-step outer product computation]
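
The original example matrices are only available as images, so the sketch below uses hypothetical 2×2 inputs. It illustrates the map step (one outer product per column/row pair) and the reduce step (element-wise sum) in plain Python; the function names are illustrative and this is not the linked Spark implementation.

```python
from functools import reduce


def outer(col, row):
    """Outer product of a column (m values) and a row (k values): m x k."""
    return [[a * b for b in row] for a in col]


def add(X, Y):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def outer_product_multiply(A, B):
    if len(A[0]) != len(B):
        raise ValueError("A must have as many columns as B has rows")
    cols_of_A = list(zip(*A))                      # columns of A
    # "map": one m x k partial product per (column of A, row of B) pair
    partials = [outer(col, row) for col, row in zip(cols_of_A, B)]
    # "reduce": sum the n partial products
    return reduce(add, partials)


# Hypothetical example matrices.
A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
C = outer_product_multiply(A, B)   # [[19, 22], [43, 50]]
```

The two partial products are [[5, 6], [15, 18]] and [[14, 16], [28, 32]], and their sum equals the usual row-by-column product.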

Source code: https://www.thinkbiganalytics.com/wp-content/uploads/2017/03/SparkMultiplication-1.txt

Measure multiplication performance for square matrices

Using the Spark MLlib BlockMatrix.multiply function,
we need to measure the performance of multiplying two square matrices under different settings.

Matrix sizes:
1K×1K
2K×2K
4K×4K
8K×8K
16K×16K
32K×32K

For the number of blocks within each matrix,
trying 1×1, 2×2, 4×4, 8×8, 16×16, 1×8, 8×1, 1×16, 16×1 and so on should be enough.
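
A small sketch that enumerates this experiment grid and derives the block dimensions for each run. It assumes 1K means 1024 (so every size divides evenly) and that a layout "r×c" means r blocks along the rows and c blocks along the columns; both are assumptions, and `experiment_configs` is a hypothetical helper name. The derived values correspond to the `rowsPerBlock` and `colsPerBlock` parameters of BlockMatrix.

```python
from itertools import product

K = 1024                               # assumption: 1K = 1024 rows/columns
SIZES_IN_K = [1, 2, 4, 8, 16, 32]      # square matrices of size (s*K) x (s*K)
BLOCK_GRIDS = [(1, 1), (2, 2), (4, 4), (8, 8), (16, 16),
               (1, 8), (8, 1), (1, 16), (16, 1)]


def experiment_configs(sizes=SIZES_IN_K, grids=BLOCK_GRIDS, k=K):
    """Return one config per (matrix size, block grid) combination."""
    configs = []
    for size, (row_blocks, col_blocks) in product(sizes, grids):
        n = size * k
        configs.append({
            "n": n,
            "row_blocks": row_blocks,
            "col_blocks": col_blocks,
            "rows_per_block": n // row_blocks,   # rowsPerBlock
            "cols_per_block": n // col_blocks,   # colsPerBlock
        })
    return configs


configs = experiment_configs()
```

This yields 6 × 9 = 54 configurations. Note that BlockMatrix.multiply requires the left operand's colsPerBlock to equal the right operand's rowsPerBlock, so the non-square layouts (1×8, 8×1, and so on) need a matching layout on the other operand.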
